openwrt/staging/blogic.git
11 years agosfc: Add GFP flags to efx_nic_alloc_buffer() and make most callers allow blocking
Ben Hutchings [Tue, 18 Sep 2012 20:59:52 +0000 (21:59 +0100)]
sfc: Add GFP flags to efx_nic_alloc_buffer() and make most callers allow blocking

Most call sites for efx_nic_alloc_buffer() are part of the probe or
reconfiguration paths and can allocate with GFP_KERNEL.  A few others
should use GFP_NOIO (I think).  Only one is in atomic context and
must use the current GFP_ATOMIC.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Make MCDI independent of Siena
Ben Hutchings [Tue, 18 Sep 2012 01:33:56 +0000 (02:33 +0100)]
sfc: Make MCDI independent of Siena

Move the lowest layer (transport) of the current MCDI code to
per-NIC-type operations.

Introduce a new structure and efx_nic member for MCDI-specific data.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Make efx_mcdi_init() call efx_mcdi_handle_assertion()
Ben Hutchings [Tue, 18 Sep 2012 01:33:55 +0000 (02:33 +0100)]
sfc: Make efx_mcdi_init() call efx_mcdi_handle_assertion()

This should probably be done during MCDI initialisation for any NIC.
Change efx_mcdi_init() to return an error code.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Collect all MCDI port functions into mcdi_port.c
Ben Hutchings [Tue, 18 Sep 2012 01:33:54 +0000 (02:33 +0100)]
sfc: Collect all MCDI port functions into mcdi_port.c

Collect together MCDI port functions from mcdi.c, mcdi_mac.c,
mcdi_phy.c and siena.c.  Rename the 'siena' functions accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Move efx_mcdi_mac_reconfigure() to siena.c and rename
Ben Hutchings [Mon, 8 Oct 2012 15:56:18 +0000 (16:56 +0100)]
sfc: Move efx_mcdi_mac_reconfigure() to siena.c and rename

EF10 does not include a multicast hash filter, so this function is
specific to Siena.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Move siena_reset_hw() and siena_map_reset_reason() into MCDI module
Ben Hutchings [Tue, 18 Sep 2012 01:33:52 +0000 (02:33 +0100)]
sfc: Move siena_reset_hw() and siena_map_reset_reason() into MCDI module

These implementations should work for EF10 too.  Rename them
accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Add and use MCDI_SET_QWORD() and MCDI_SET_ARRAY_QWORD()
Ben Hutchings [Wed, 10 Oct 2012 22:20:17 +0000 (23:20 +0100)]
sfc: Add and use MCDI_SET_QWORD() and MCDI_SET_ARRAY_QWORD()

No need to keep open-coding the assignment of high and low dwords.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Ensure MCDI buffers, but not lengths, are dword aligned
Ben Hutchings [Fri, 14 Sep 2012 16:31:41 +0000 (17:31 +0100)]
sfc: Ensure MCDI buffers, but not lengths, are dword aligned

We currently require that MCDI request and response lengths are
multiples of 4 bytes, because we will copy dwords in and out of shared
memory and we want to be sure we won't read or write out of bounds.
But all we really need to know is that there is sufficient padding for
that.  Also, we should ensure that buffers are dword-aligned, as on
some architectures misaligned access will result in data corruption or
a crash.

Change the buffer type to array-of-efx_dword_t and remove the
requirement that the lengths are multiples of 4.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Use proper macros to declare and access MCDI arrays
Ben Hutchings [Fri, 14 Sep 2012 16:31:33 +0000 (17:31 +0100)]
sfc: Use proper macros to declare and access MCDI arrays

A few functions are using heap buffers; change them to use stack
buffers as we really don't need to resort to the heap for a 252
byte buffer in process context.

MC_CMD_MEMCPY is quite weird in that it can use inline data placed in
the request buffer after the array of records.  Thus there are two
variable-length arrays and we can't use the normal accessors for
the second.  So we have to use _MCDI_PTR() in efx_sriov_memcpy().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Introduce and use MCDI_CTL_SDU_LEN_MAX_V1 macro for Siena-specific code
Ben Hutchings [Tue, 20 Aug 2013 14:47:12 +0000 (15:47 +0100)]
sfc: Introduce and use MCDI_CTL_SDU_LEN_MAX_V1 macro for Siena-specific code

The MCDI version 2 protocol supports larger payloads, but will
not be implemented on Siena.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Fill out the set of MCDI accessors
Ben Hutchings [Fri, 14 Sep 2012 16:30:58 +0000 (17:30 +0100)]
sfc: Fill out the set of MCDI accessors

We need to access arrays of 16-bit words and 32-bit dwords in MCDI
buffers based on the MCDI protocol definitions.

We should also be able to read and write fields within structures,
without specifying an array index each time.  So add MCDI_FIELD()
and make MCDI_ARRAY_FIELD() use it.  Also add MCDI_SET_FIELD().

Split MCDI_ARRAY_PTR() into MCDI_ARRAY_STRUCT_PTR() and
_MCDI_ARRAY_PTR(), which are currently identical but will diverge in
later changes.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Rationalise MCDI buffer accessors
Ben Hutchings [Fri, 14 Sep 2012 16:30:44 +0000 (17:30 +0100)]
sfc: Rationalise MCDI buffer accessors

Add _MCDI_DWORD() which yields an lvalue for the given dword field
and change MCDI_DWORD(), MCDI_SET_DWORD() and MCDI_QWORD() to use it.

Fold the rather trivial MCDI_PTR2() into MCDI_PTR() and _MCDI_DWORD().

Remove MCDI_SET_DWORD2() and MCDI_QWORD2().  MCDI_DWORD2() should also
go, but it still has one user which we'll get rid of later.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Introduce and use MCDI_DECLARE_BUF macro
Ben Hutchings [Fri, 14 Sep 2012 16:30:10 +0000 (17:30 +0100)]
sfc: Introduce and use MCDI_DECLARE_BUF macro

MCDI_DECLARE_BUF declares a variable as an MCDI buffer of the
requested length, adding any necessary padding.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Move more Falcon-specific code and definitions into falcon.c
Ben Hutchings [Thu, 13 Sep 2012 00:11:31 +0000 (01:11 +0100)]
sfc: Move more Falcon-specific code and definitions into falcon.c

In particular, fold in the whole of falcon_xmac.c.

Drop some entirely unused definitions.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Move details of a Falcon bug workaround out of ethtool.c
Ben Hutchings [Thu, 13 Sep 2012 00:11:25 +0000 (01:11 +0100)]
sfc: Move details of a Falcon bug workaround out of ethtool.c

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: Use efx_mcdi_mon() to find efx_mcdi_mon structure from efx_nic
Ben Hutchings [Thu, 13 Sep 2012 00:11:24 +0000 (01:11 +0100)]
sfc: Use efx_mcdi_mon() to find efx_mcdi_mon structure from efx_nic

This needs to be done before we separate MCDI from siena_nic_data.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agosfc: const-qualify source pointers for MMIO write functions
Ben Hutchings [Thu, 13 Sep 2012 00:11:23 +0000 (01:11 +0100)]
sfc: const-qualify source pointers for MMIO write functions

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agoMerge branch 'sfc-3.11'
Ben Hutchings [Wed, 21 Aug 2013 13:19:36 +0000 (14:19 +0100)]
Merge branch 'sfc-3.11'

Merge sfc fixes destined for 3.11 so we can avoid conflicts with
development for 3.12+.

11 years agosfc: Fix lookup of default RX MAC filters when steered using ethtool
Ben Hutchings [Tue, 9 Jul 2013 16:12:49 +0000 (17:12 +0100)]
sfc: Fix lookup of default RX MAC filters when steered using ethtool

commit 385904f819e3 ('sfc: Don't use
efx_filter_{build,hash,increment}() for default MAC filters') used the
wrong name to find the index of default RX MAC filters at insertion/
update time.  This could result in memory corruption and would in any
case silently fail to update the filter.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
11 years agoMerge branch 'openvswitch_vxlan'
David S. Miller [Tue, 20 Aug 2013 07:16:47 +0000 (00:16 -0700)]
Merge branch 'openvswitch_vxlan'

Pravin B Shelar says:

====================
openvswitch: VXLAN tunneling.

First four vxlan patches extends vxlan so that openvswitch
can share vxlan recv code. Rest of patches refactors vxlan
data plane so that ovs can share that code with vxlan module.
Last patch adds vxlan-vport to openvswitch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoopenvswitch: Add vxlan tunneling support.
Pravin B Shelar [Mon, 19 Aug 2013 18:23:34 +0000 (11:23 -0700)]
openvswitch: Add vxlan tunneling support.

Following patch adds vxlan vport type for openvswitch using
vxlan api. So now there is vxlan dependency for openvswitch.

CC: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: Add tx-vlan offload support.
Pravin B Shelar [Mon, 19 Aug 2013 18:23:29 +0000 (11:23 -0700)]
vxlan: Add tx-vlan offload support.

Following patch allows transmit side vlan offload for vxlan
devices.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: Improve vxlan headroom calculation.
Pravin B Shelar [Mon, 19 Aug 2013 18:23:22 +0000 (11:23 -0700)]
vxlan: Improve vxlan headroom calculation.

Rather than having static headroom calculation, adjust headroom
according to target device.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: Factor out vxlan send api.
Pravin B Shelar [Mon, 19 Aug 2013 18:23:17 +0000 (11:23 -0700)]
vxlan: Factor out vxlan send api.

Following patch allows more code sharing between vxlan and ovs-vxlan.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: Extend vxlan handlers for openvswitch.
Pravin B Shelar [Mon, 19 Aug 2013 18:23:07 +0000 (11:23 -0700)]
vxlan: Extend vxlan handlers for openvswitch.

Following patch adds data field to vxlan socket and export
vxlan handler api.
vh->data is required to store private data per vxlan handler.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: Add vxlan recv demux.
Pravin B Shelar [Mon, 19 Aug 2013 18:23:02 +0000 (11:23 -0700)]
vxlan: Add vxlan recv demux.

Once we have ovs-vxlan functionality, one UDP port can be assigned
to kernel-vxlan or ovs-vxlan port.  Therefore following patch adds
vxlan demux functionality, so that vxlan or ovs module can
register for particular port.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: Restructure vxlan receive.
Pravin B Shelar [Mon, 19 Aug 2013 18:22:54 +0000 (11:22 -0700)]
vxlan: Restructure vxlan receive.

Use iptunnel_pull_header() for better code sharing.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovxlan: Restructure vxlan socket apis.
Pravin B Shelar [Mon, 19 Aug 2013 18:22:48 +0000 (11:22 -0700)]
vxlan: Restructure vxlan socket apis.

Restructure vxlan-socket management APIs so that it can be
shared between vxlan and ovs modules.
This patch does not change any functionality.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
v6-v7:
 - get rid of zero refcnt vs from hashtable.
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8152: add comments
hayeswang [Fri, 16 Aug 2013 08:09:38 +0000 (16:09 +0800)]
r8152: add comments

Add comments.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8152: adjust tx_bottom function
hayeswang [Fri, 16 Aug 2013 08:09:37 +0000 (16:09 +0800)]
r8152: adjust tx_bottom function

Split some parts of code into another function to simplify
tx_bottom(). Use while loop to replace the goto loop.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8152: move some declearation of variables
hayeswang [Fri, 16 Aug 2013 08:09:36 +0000 (16:09 +0800)]
r8152: move some declearation of variables

Move some declearation of variables in rx_bottom().

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8152: adjust some duplicated code
hayeswang [Fri, 16 Aug 2013 08:09:35 +0000 (16:09 +0800)]
r8152: adjust some duplicated code

- Use r8152_get_tx_agg for getting tx agg list
- Replace submit rx with goto submit

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8152: replace lockflags with flags
hayeswang [Fri, 16 Aug 2013 08:09:34 +0000 (16:09 +0800)]
r8152: replace lockflags with flags

Replace lockflags with flags.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8152: replace void * with struct r8152 *
hayeswang [Fri, 16 Aug 2013 08:09:33 +0000 (16:09 +0800)]
r8152: replace void * with struct r8152 *

Change the type of contex of tx_agg and rx_agg from void * to
staruc r8152 *.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8152: remove clearing the memory to zero for netdev priv
hayeswang [Fri, 16 Aug 2013 08:09:32 +0000 (16:09 +0800)]
r8152: remove clearing the memory to zero for netdev priv

Remove memset(tp, 0, sizeof(*tp));

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agosis900: don't restart auto-negotiation each time after link resume.
Denis Kirjanov [Fri, 16 Aug 2013 07:20:07 +0000 (11:20 +0400)]
sis900: don't restart auto-negotiation each time after link resume.

Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Fri, 16 Aug 2013 22:37:26 +0000 (15:37 -0700)]
Merge git://git./linux/kernel/git/davem/net

11 years agoMerge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Linus Torvalds [Fri, 16 Aug 2013 17:00:18 +0000 (10:00 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux

Pull clock controller fixes from Michael Turquette:
 "Two small fixes for the Zynq clock controller introduced in 3.11-rc1
  and another Exynos clock patch which fixes a regression that prevents
  the video pipeline from functioning on that platform"

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
  clk: exynos4: Add CLK_GET_RATE_NOCACHE flag for the Exynos4x12 ISP clocks
  clk/zynq/clkc: Add CLK_SET_RATE_PARENT flag to ethernet muxes
  clk/zynq/clkc: Add dedicated spinlock for the SWDT

11 years agoMerge tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 16 Aug 2013 16:59:00 +0000 (09:59 -0700)]
Merge tag 'pm-3.11-rc6' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "The removal of delayed_work_pending() checks from kernel/power/qos.c
  done in 3.9 introduced a deadlock in pm_qos_work_fn().

  Fix from Stephen Boyd"

* tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()

11 years agoMerge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Fri, 16 Aug 2013 16:58:21 +0000 (09:58 -0700)]
Merge tag 'sound-3.11' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "This batch contains a few USB audio fixes, a couple of HD-audio
  quirks, various small ASoC driver fixes in addition to an ASoC core
  fix that may lead to memory corruption.

  Unfortunately slightly more volume than the previous pull request, but
  all are reasonable regression fixes"

* tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Add a fixup for Gateway LT27
  ASoC: tegra: fix Tegra30 I2S capture parameter setup
  ALSA: usb-audio: Fix invalid volume resolution for Logitech HD Webcam C525
  ALSA: hda - Fix missing mute controls for CX5051
  ALSA: usb-audio: fix automatic Roland/Yamaha MIDI detection
  ALSA: 6fire: make buffers DMA-able (midi)
  ALSA: 6fire: make buffers DMA-able (pcm)
  ALSA: hda - Add pinfix for LG LW25 laptop
  ASoC: cs42l52: Add new TLV for Beep Volume
  ASoC: cs42l52: Reorder Min/Max and update to SX_TLV for Beep Volume
  ASoC: dapm: Fix empty list check in dapm_new_mux()
  ASoC: sgtl5000: fix buggy 'Capture Attenuate Switch' control
  ASoC: sgtl5000: prevent playback to be muted when terminating concurrent capture

11 years agoMerge tag 'usb-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 16 Aug 2013 16:57:38 +0000 (09:57 -0700)]
Merge tag 'usb-3.11-rc6' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 3.11-rc6 that have accumulated.

  Nothing huge, a EHCI fix that solves a much-reported audio USB
  problem, some usb-serial driver endian fixes and other minor fixes, a
  wireless USB oops fix, and two new quirks"

* tag 'usb-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: keyspan: fix null-deref at disconnect and release
  USB: mos7720: fix broken control requests
  usb: add two quirky touchscreen
  USB: ti_usb_3410_5052: fix big-endian firmware handling
  USB: adutux: fix big-endian device-type reporting
  USB: usbtmc: fix big-endian probe of Rigol devices
  USB: mos7840: fix big-endian probe
  USB-Serial: Fix error handling of usb_wwan
  wusbcore: fix kernel panic when disconnecting a wireless USB->serial device
  USB: EHCI: accept very late isochronous URBs

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 16 Aug 2013 16:35:29 +0000 (09:35 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix SKB leak in 8139cp, from Dave Jones.

 2) Fix use of *_PAGES interfaces with mlx5 firmware, from Moshe Lazar.

 3) RCU conversion of macvtap introduced two races, fixes by Eric
    Dumazet

 4) Synchronize statistic flows in bnx2x driver to prevent corruption,
    from Dmitry Kravkov

 5) Undo optimization in IP tunneling, we were using the inner IP header
    in some cases to inherit the IP ID, but that isn't correct in some
    circumstances.  From Pravin B Shelar

 6) Use correct struct size when parsing netlink attributes in
    rtnl_bridge_getlink().  From Asbjoern Sloth Toennesen

 7) Length verifications in tun_get_user() are bogus, from Weiping Pan
    and Dan Carpenter

 8) Fix bad merge resolution during 3.11 networking development in
    openvswitch, albeit a harmless one which added some unreachable
    code.  From Jesse Gross

 9) Wrong size used in flexible array allocation in openvswitch, from
    Pravin B Shelar

10) Clear out firmware capability flags the be2net driver isn't ready to
    handle yet, from Sarveshwar Bandi

11) Revert DMA mapping error checking addition to cxgb3 driver, it's
    buggy.  From Alexey Kardashevskiy

12) Fix regression in packet scheduler rate limiting when working with a
    link layer of ATM.  From Jesper Dangaard Brouer

13) Fix several errors in TCP Cubic congestion control, in particular
    overflow errors in timestamp calculations.  From Eric Dumazet and
    Van Jacobson

14) In ipv6 routing lookups, we need to backtrack if subtree traversal
    don't result in a match.  From Hannes Frederic Sowa

15) ipgre_header() returns incorrect packet offset.  Fix from Timo Teräs

16) Get "low latency" out of the new MIB counter names.  From Eliezer
    Tamir

17) State check in ndo_dflt_fdb_del() is inverted, from Sridhar
    Samudrala

18) Handle TCP Fast Open properly in netfilter conntrack, from Yuchung
    Cheng

19) Wrong memcpy length in pcan_usb driver, from Stephane Grosjean

20) Fix dealock in TIPC, from Wang Weidong and Ding Tianhong

21) call_rcu() call to destroy SCTP transport is done too early and
    might result in an oops.  From Daniel Borkmann

22) Fix races in genetlink family dumps, from Johannes Berg

23) Flags passed into macvlan by the user need to be validated properly,
    from Michael S Tsirkin

24) Fix skge build on 32-bit, from Stephen Hemminger

25) Handle malformed TCP headers properly in xt_TCPMSS, from Pablo Neira
    Ayuso

26) Fix handling of stacked vlans in vlan_dev_real_dev(), from Nikolay
    Aleksandrov

27) Eliminate MTU calculation overflows in esp{4,6}, from Daniel
    Borkmann

28) neigh_parms need to be setup before calling the ->ndo_neigh_setup()
    method.  From Veaceslav Falico

29) Kill out-of-bounds prefetch in fib_trie, from Eric Dumazet

30) Don't dereference MLD query message if the length isn't value in the
    bridge multicast code, from Linus Lüssing

31) Fix VXLAN IGMP join regression due to an inverted check, from Cong
    Wang

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (70 commits)
  net/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes
  tun: signedness bug in tun_get_user()
  qlcnic: Fix diagnostic interrupt test for 83xx adapters
  qlcnic: Fix beacon state return status handling
  qlcnic: Fix set driver version command
  net: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset
  net_sched: restore "linklayer atm" handling
  drivers/net/ethernet/via/via-velocity.c: update napi implementation
  Revert "cxgb3: Check and handle the dma mapping errors"
  be2net: Clear any capability flags that driver is not interested in.
  openvswitch: Reset tunnel key between input and output.
  openvswitch: Use correct type while allocating flex array.
  openvswitch: Fix bad merge resolution.
  tun: compare with 0 instead of total_len
  rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header
  ethernet/arc/arc_emac - fix NAPI "work > weight" warning
  ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id.
  bnx2x: prevent crash in shutdown flow with CNIC
  bnx2x: fix PTE write access error
  bnx2x: fix memory leak in VF
  ...

11 years agoFix TLB gather virtual address range invalidation corner cases
Linus Torvalds [Thu, 15 Aug 2013 18:42:25 +0000 (11:42 -0700)]
Fix TLB gather virtual address range invalidation corner cases

Ben Tebulin reported:

 "Since v3.7.2 on two independent machines a very specific Git
  repository fails in 9/10 cases on git-fsck due to an SHA1/memory
  failures.  This only occurs on a very specific repository and can be
  reproduced stably on two independent laptops.  Git mailing list ran
  out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc67f97 ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly
buggered.  It was introduced in commit 597e1c3580b7 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96ce0 ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.

The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB.  And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler.  And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoALSA: hda - Add a fixup for Gateway LT27
Takashi Iwai [Fri, 16 Aug 2013 06:17:05 +0000 (08:17 +0200)]
ALSA: hda - Add a fixup for Gateway LT27

Gateway LT27 needs a fixup for the inverted digital mic.

Reported-by: "Nathanael D. Noblet" <nathanael@gnat.ca>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agonetlink: Eliminate kmalloc in netlink dump operation.
Pravin B Shelar [Thu, 15 Aug 2013 22:31:06 +0000 (15:31 -0700)]
netlink: Eliminate kmalloc in netlink dump operation.

Following patch stores struct netlink_callback in netlink_sock
to avoid allocating and freeing it on every netlink dump msg.
Only one dump operation is allowed for a given socket at a time
therefore we can safely convert cb pointer to cb struct inside
netlink_sock.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes
Moshe Lazer [Wed, 14 Aug 2013 14:46:48 +0000 (17:46 +0300)]
net/mlx5_core: Support MANAGE_PAGES and QUERY_PAGES firmware command changes

In the previous QUERY_PAGES command version we used one command to get the
required amount of boot, init and post init pages.  The new version uses the
op_mod field to specify whether the query is for the required amount of boot,
init or post init pages. In addition the output field size for the required
amount of pages increased from 16 to 32 bits.

In MANAGE_PAGES command the input_num_entries and output_num_entries fields
sizes changed from 16 to 32 bits and the PAS tables offset changed to 0x10.

In the pages request event the num_pages field also changed to 32 bits.

In the HCA-capabilities-layout the size and location of max_qp_mcg field has
been changed to support 24 bits.

This patch isn't compatible with firmware versions < 5; however, it  turns out that the
first GA firmware we will publish will not support previous versions so this should be OK.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotun: signedness bug in tun_get_user()
Dan Carpenter [Thu, 15 Aug 2013 12:52:57 +0000 (15:52 +0300)]
tun: signedness bug in tun_get_user()

The recent fix d9bf5f1309 "tun: compare with 0 instead of total_len" is
not totally correct.  Because "len" and "sizeof()" are size_t type, that
means they are never less than zero.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Update version to 5.2.46.
Sucheta Chakraborty [Thu, 15 Aug 2013 12:27:29 +0000 (08:27 -0400)]
qlcnic: Update version to 5.2.46.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Dump mailbox command data when a command times out
Manish Chopra [Thu, 15 Aug 2013 12:27:28 +0000 (08:27 -0400)]
qlcnic: Dump mailbox command data when a command times out

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Fix driver initialization for 83xx adapters
Manish Chopra [Thu, 15 Aug 2013 12:27:27 +0000 (08:27 -0400)]
qlcnic: Fix driver initialization for 83xx adapters

o Load firmware from file before setting up interrupts.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Flush mailbox command list when mailbox is not available
Manish Chopra [Thu, 15 Aug 2013 12:27:26 +0000 (08:27 -0400)]
qlcnic: Flush mailbox command list when mailbox is not available

o Driver was hitting a panic at the time of adapter reset due to invalid command
  access from the list which had been already freed by the queuing thread.
  Flush all the pending commands from the list before proceeding with adapter reset

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Reinitialize mailbox data structures after firmware reset
Manish Chopra [Thu, 15 Aug 2013 12:27:25 +0000 (08:27 -0400)]
qlcnic: Reinitialize mailbox data structures after firmware reset

o After firmware reset VFs were failing to come up because of not
  reinitializing mailbox data structures. Reinitialize them so that
  VFs can come up after firmware reset.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: proc_fs: trivial: print UIDs as unsigned int
Francesco Fusco [Thu, 15 Aug 2013 11:42:14 +0000 (13:42 +0200)]
net: proc_fs: trivial: print UIDs as unsigned int

UIDs are printed in the proc_fs as signed int, whereas
they are unsigned int.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Fix diagnostic interrupt test for 83xx adapters
Manish Chopra [Thu, 15 Aug 2013 12:29:29 +0000 (08:29 -0400)]
qlcnic: Fix diagnostic interrupt test for 83xx adapters

o Do not allow interrupt test when adapter is resetting.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Fix beacon state return status handling
Sucheta Chakraborty [Thu, 15 Aug 2013 12:29:28 +0000 (08:29 -0400)]
qlcnic: Fix beacon state return status handling

o Driver was misinterpreting the return status for beacon
  state query leading to incorrect interpretation of beacon
  state and logging an error message for successful status.
  Fixed the driver to properly interpret the return status.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoqlcnic: Fix set driver version command
Himanshu Madhani [Thu, 15 Aug 2013 12:29:27 +0000 (08:29 -0400)]
qlcnic: Fix set driver version command

Driver was issuing set driver version command through all
functions in the adapter. Fix the driver to issue set driver
version once per adapter, through function 0.

Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset
Daniel Borkmann [Tue, 13 Aug 2013 18:45:13 +0000 (11:45 -0700)]
net: tg3: fix NULL pointer dereference in tg3_io_error_detected and tg3_io_slot_reset

Commit d8af4dfd8 ("net/tg3: Fix kernel crash") introduced a possible
NULL pointer dereference in tg3 driver when !netdev || !netif_running(netdev)
condition is met and netdev is NULL. Then, the jump to the 'done' label
calls dev_close() with a netdevice that is NULL. Therefore, only call
dev_close() when we have a netdevice, but one that is not running.

[ Add the same checks in tg3_io_slot_reset() per Gavin Shan - by Nithin
Nayak Sujir ]

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge tag 'asoc-v3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie...
Takashi Iwai [Thu, 15 Aug 2013 18:43:46 +0000 (20:43 +0200)]
Merge tag 'asoc-v3.11-rc5' of git://git./linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v3.11

A few driver specific fixes here plus one core fix for a memory
corruption issue in DAPM initialisation which could lead to crashes.

11 years agoMerge remote-tracking branch 'asoc/fix/tegra' into asoc-linus
Mark Brown [Thu, 15 Aug 2013 10:37:54 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/tegra' into asoc-linus

11 years agoMerge remote-tracking branch 'asoc/fix/sgtl5000' into asoc-linus
Mark Brown [Thu, 15 Aug 2013 10:37:53 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/sgtl5000' into asoc-linus

11 years agoMerge remote-tracking branch 'asoc/fix/dapm' into asoc-linus
Mark Brown [Thu, 15 Aug 2013 10:37:53 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/dapm' into asoc-linus

11 years agoMerge remote-tracking branch 'asoc/fix/cs42l52' into asoc-linus
Mark Brown [Thu, 15 Aug 2013 10:37:52 +0000 (11:37 +0100)]
Merge remote-tracking branch 'asoc/fix/cs42l52' into asoc-linus

11 years agoASoC: tegra: fix Tegra30 I2S capture parameter setup
Stephen Warren [Wed, 14 Aug 2013 20:24:16 +0000 (14:24 -0600)]
ASoC: tegra: fix Tegra30 I2S capture parameter setup

The Tegra30 I2S driver was writing the AHUB interface parameters to the
playback path register rather than the capture path register. This
caused the capture parameters not to be configured at all, so if
capturing using non-HW-default parameters (e.g. 16-bit stereo rather
than 8-bit mono) the audio would be corrupted.

With this fixed, audio capture from an analog microphone works correctly
on the Cardhu board.

Cc: stable@vger.kernel.org
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
11 years agonet_sched: restore "linklayer atm" handling
Jesper Dangaard Brouer [Wed, 14 Aug 2013 21:47:11 +0000 (23:47 +0200)]
net_sched: restore "linklayer atm" handling

commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "linklayer atm" handling.

 tc class add ... htb rate X ceil Y linklayer atm

The linklayer setting is implemented by modifying the rate table
which is send to the kernel.  No direct parameter were
transferred to the kernel indicating the linklayer setting.

The commit 56b765b79 ("htb: improved accuracy at high rates")
removed the use of the rate table system.

To keep compatible with older iproute2 utils, this patch detects
the linklayer by parsing the rate table.  It also supports future
versions of iproute2 to send this linklayer parameter to the
kernel directly. This is done by using the __reserved field in
struct tc_ratespec, to convey the choosen linklayer option, but
only using the lower 4 bits of this field.

Linklayer detection is limited to speeds below 100Mbit/s, because
at high rates the rtab is gets too inaccurate, so bad that
several fields contain the same values, this resembling the ATM
detect.  Fields even start to contain "0" time to send, e.g. at
1000Mbit/s sending a 96 bytes packet cost "0", thus the rtab have
been more broken than we first realized.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
David S. Miller [Thu, 15 Aug 2013 08:41:10 +0000 (01:41 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/jesse/openvswitch

Jesse Gross says:

====================
Three bug fixes that are fairly small either way but resolve obviously
incorrect code. For net/3.11.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrivers/net/ethernet/via/via-velocity.c: update napi implementation
Julia Lawall [Wed, 14 Aug 2013 14:26:53 +0000 (16:26 +0200)]
drivers/net/ethernet/via/via-velocity.c: update napi implementation

Drivers supporting NAPI should use a NAPI-specific function for receiving
packets.  Hence netif_rx is changed to netif_receive_skb.

Furthermore netif_napi_del should be used in the probe and remove function
to clean up the NAPI resource information.

Thanks to Francois Romieu, David Shwatrz and Rami Rosen for their help on
this patch.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/usb/r8152: enable interrupt transfer
hayeswang [Wed, 14 Aug 2013 12:54:40 +0000 (20:54 +0800)]
net/usb/r8152: enable interrupt transfer

Use the interrupt transfer to replace polling link status.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/usb/r8152: enable tx checksum
hayeswang [Wed, 14 Aug 2013 12:54:39 +0000 (20:54 +0800)]
net/usb/r8152: enable tx checksum

Enable tx checksum.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/usb/r8152: support aggregation
hayeswang [Wed, 14 Aug 2013 12:54:38 +0000 (20:54 +0800)]
net/usb/r8152: support aggregation

Enable the tx/rx aggregation which could contain one or more packets
for each bulk in/out. This could reduce the loading of the host
controller by sending less bulk transfer.

The rx packets in the bulk in buffer should be 8-byte aligned, and
the tx packets in the bulk out buffer should be 4-byte aligned.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agortnetlink: remove an unneeded test
Dan Carpenter [Wed, 14 Aug 2013 09:35:42 +0000 (12:35 +0300)]
rtnetlink: remove an unneeded test

We know that "dev" is a valid pointer at this point, so we can remove
the test and clean up a little.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoRevert "cxgb3: Check and handle the dma mapping errors"
Alexey Kardashevskiy [Wed, 14 Aug 2013 09:19:01 +0000 (19:19 +1000)]
Revert "cxgb3: Check and handle the dma mapping errors"

This reverts commit f83331bab149e29fa2c49cf102c0cd8c3f1ce9f9.

As the tests PPC64 (powernv platform) show, IOMMU pages are leaking
when transferring big amount of small packets (<=64 bytes),
"ping -f" and waiting for 15 seconds is the simplest way to confirm the bug.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Santosh Rastapur <santosh@chelsio.com>
Cc: Jay Fenlason <fenlason@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Divy Le ray <divy@chelsio.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobe2net: Clear any capability flags that driver is not interested in.
Sarveshwar Bandi [Wed, 14 Aug 2013 07:51:47 +0000 (13:21 +0530)]
be2net: Clear any capability flags that driver is not interested in.

It is possible for some versions of firmware to advertise capabilities that driver
is not ready to handle. This may lead to controller stall. Since the driver is
interested only in subset of flags, clearing the rest.

Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'x-netns'
David S. Miller [Thu, 15 Aug 2013 08:01:27 +0000 (01:01 -0700)]
Merge branch 'x-netns'

Nicolas Dichtel says:

====================
This series is a follow up of the previous series whcih adds this
functionality for sit tunnels.

The goal is to add x-netns support for the module ipip and ip6_tunnel,
ie. the encapsulation addresses and the network device are not owned
by the same namespace.

Note that the two first patches are cleanups.

Example to configure an ipip tunnel:

modprobe ipip
ip netns add netns1
ip link add ipip1 type ipip remote 10.16.0.121 local 10.16.0.249
ip l s ipip1 netns netns1
ip netns exec netns1 ip l s lo up
ip netns exec netns1 ip l s ipip1 up
ip netns exec netns1 ip a a dev ipip1 192.168.2.123 remote 192.168.2.121

or an ip6_tunnel:

modprobe ip6_tunnel
ip netns add netns1
ip link add ip6tnl1 type ip6tnl remote 2001:660:3008:c1c3::121 local 2001:660:3008:c1c3::123
ip l s ip6tnl1 netns netns1
ip netns exec netns1 ip l s lo up
ip netns exec netns1 ip l s ip6tnl1 up
ip netns exec netns1 ip a a dev ip6tnl1 192.168.1.123 remote 192.168.1.121
ip netns exec netns1 ip -6 a a dev ip6tnl1 2001:1235::123 remote 2001:1235::121

v2: remove the patch 1/3 of the v1 series (already included)
    use net_eq()
    add patch 1/4 and 2/4
====================-

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoip6tnl: add x-netns support
Nicolas Dichtel [Tue, 13 Aug 2013 15:51:12 +0000 (17:51 +0200)]
ip6tnl: add x-netns support

This patch allows to switch the netns when packet is encapsulated or
decapsulated. In other word, the encapsulated packet is received in a netns,
where the lookup is done to find the tunnel. Once the tunnel is found, the
packet is decapsulated and injecting into the corresponding interface which
stands to another netns.

When one of the two netns is removed, the tunnel is destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipip: add x-netns support
Nicolas Dichtel [Tue, 13 Aug 2013 15:51:11 +0000 (17:51 +0200)]
ipip: add x-netns support

This patch allows to switch the netns when packet is encapsulated or
decapsulated. In other word, the encapsulated packet is received in a netns,
where the lookup is done to find the tunnel. Once the tunnel is found, the
packet is decapsulated and injecting into the corresponding interface which
stands to another netns.

When one of the two netns is removed, the tunnel is destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv4 tunnels: use net_eq() helper to check netns
Nicolas Dichtel [Tue, 13 Aug 2013 15:51:10 +0000 (17:51 +0200)]
ipv4 tunnels: use net_eq() helper to check netns

It's better to use available helpers for these tests.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodev: move skb_scrub_packet() after eth_type_trans()
Nicolas Dichtel [Tue, 13 Aug 2013 15:51:09 +0000 (17:51 +0200)]
dev: move skb_scrub_packet() after eth_type_trans()

skb_scrub_packet() was called before eth_type_trans() to let eth_type_trans()
set pkt_type.

In fact, we should force pkt_type to PACKET_HOST, so move the call after
eth_type_trans().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoopenvswitch: Reset tunnel key between input and output.
Jesse Gross [Wed, 14 Aug 2013 22:50:36 +0000 (15:50 -0700)]
openvswitch: Reset tunnel key between input and output.

It doesn't make sense to output a tunnel packet using the same
parameters that it was received with since that will generally
just result in the packet going back to us. As a result, userspace
assumes that the tunnel key is cleared when transitioning through
the switch. In the majority of cases this doesn't matter since a
packet is either going to a tunnel port (in which the key is
overwritten with new values) or to a non-tunnel port (in which
case the key is ignored). However, it's theoreticaly possible that
userspace could rely on the documented behavior, so this corrects
it.

Signed-off-by: Jesse Gross <jesse@nicira.com>
11 years agoopenvswitch: Use correct type while allocating flex array.
Pravin B Shelar [Tue, 30 Jul 2013 22:44:14 +0000 (15:44 -0700)]
openvswitch: Use correct type while allocating flex array.

Flex array is used to allocate hash buckets which is type struct
hlist_head, but we use `struct hlist_head *` to calculate
array size.  Since hlist_head is of size pointer it works fine.

Following patch use correct type.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
11 years agoopenvswitch: Fix bad merge resolution.
Jesse Gross [Mon, 13 May 2013 15:41:06 +0000 (08:41 -0700)]
openvswitch: Fix bad merge resolution.

git silently included an extra hunk in vport_cmd_set() during
automatic merging. This code is unreachable so it does not actually
introduce a problem but it is clearly incorrect.

Signed-off-by: Jesse Gross <jesse@nicira.com>
11 years agoUSB: keyspan: fix null-deref at disconnect and release
Johan Hovold [Tue, 13 Aug 2013 11:27:35 +0000 (13:27 +0200)]
USB: keyspan: fix null-deref at disconnect and release

Make sure to fail properly if the device is not accepted during attach
in order to avoid null-pointer derefs (of missing interface private
data) at disconnect or release.

Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoUSB: mos7720: fix broken control requests
Johan Hovold [Tue, 13 Aug 2013 11:27:34 +0000 (13:27 +0200)]
USB: mos7720: fix broken control requests

The parallel-port code of the drivers used a stack allocated
control-request buffer for asynchronous (and possibly deferred) control
requests. This not only violates the no-DMA-from-stack requirement but
could also lead to corrupt control requests being submitted.

Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agousb: add two quirky touchscreen
Oliver Neukum [Wed, 14 Aug 2013 09:01:46 +0000 (11:01 +0200)]
usb: add two quirky touchscreen

These devices tend to become unresponsive after S3

Signed-off-by: Oliver Neukum <oneukum@suse.de>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoMerge branch 'akpm' (patches from Andrew Morton)
Linus Torvalds [Wed, 14 Aug 2013 17:04:43 +0000 (10:04 -0700)]
Merge branch 'akpm' (patches from Andrew Morton)

Merge a bunch of fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  fs/proc/task_mmu.c: fix buffer overflow in add_page_map()
  arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
  ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
  x86 get_unmapped_area(): use proper mmap base for bottom-up direction
  ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
  ocfs2: Revert 40bd62e to avoid regression in extended allocation
  drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit
  hugetlb: fix lockdep splat caused by pmd sharing
  aoe: adjust ref of head for compound page tails
  microblaze: fix clone syscall
  mm: save soft-dirty bits on file pages
  mm: save soft-dirty bits on swapped pages
  memcg: don't initialize kmem-cache destroying work for root caches

11 years agotun: compare with 0 instead of total_len
Weiping Pan [Tue, 13 Aug 2013 13:46:56 +0000 (21:46 +0800)]
tun: compare with 0 instead of total_len

Since we set "len = total_len" in the beginning of tun_get_user(),
so we should compare the new len with 0, instead of total_len,
or the if statement always returns false.

Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agortnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header
Asbjoern Sloth Toennesen [Mon, 12 Aug 2013 16:30:09 +0000 (16:30 +0000)]
rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header

Fix the iproute2 command `bridge vlan show`, after switching from
rtgenmsg to ifinfomsg.

Let's start with a little history:

Feb 20:   Vlad Yasevich got his VLAN-aware bridge patchset included in
          the 3.9 merge window.
          In the kernel commit 6cbdceeb, he added attribute support to
          bridge GETLINK requests sent with rtgenmsg.

Mar 6th:  Vlad got this iproute2 reference implementation of the bridge
          vlan netlink interface accepted (iproute2 9eff0e5c)

Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca)
          http://patchwork.ozlabs.org/patch/239602/
          http://marc.info/?t=136680900700007

Apr 28th: Linus released 3.9

Apr 30th: Stephen released iproute2 3.9.0

The `bridge vlan show` command haven't been working since the switch to
ifinfomsg, or in a released version of iproute2. Since the kernel side
only supports rtgenmsg, which iproute2 switched away from just prior to
the iproute2 3.9.0 release.

I haven't been able to find any documentation, about neither rtgenmsg
nor ifinfomsg, and in which situation to use which, but kernel commit
88c5b5ce seams to suggest that ifinfomsg should be used.

Fixing this in kernel will break compatibility, but I doubt that anybody
have been using it due to this bug in the user space reference
implementation, at least not without noticing this bug. That said the
functionality is still fully functional in 3.9, when reversing iproute2
commit 63338dca.

This could also be fixed in iproute2, but thats an ugly patch that would
reintroduce rtgenmsg in iproute2, and from searching in netdev it seams
like rtgenmsg usage is discouraged. I'm assuming that the only reason
that Vlad implemented the kernel side to use rtgenmsg, was because
iproute2 was using it at the time.

Signed-off-by: Asbjoern Sloth Toennesen <ast@fiberby.net>
Reviewed-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agofs/proc/task_mmu.c: fix buffer overflow in add_page_map()
yonghua zheng [Tue, 13 Aug 2013 23:01:03 +0000 (16:01 -0700)]
fs/proc/task_mmu.c: fix buffer overflow in add_page_map()

Recently we met quite a lot of random kernel panic issues after enabling
CONFIG_PROC_PAGE_MONITOR.  After debuggind we found this has something
to do with following bug in pagemap:

In struct pagemapread:

  struct pagemapread {
      int pos, len;
      pagemap_entry_t *buffer;
      bool v2;
  };

pos is number of PM_ENTRY_BYTES in buffer, but len is the size of
buffer, it is a mistake to compare pos and len in add_page_map() for
checking buffer is full or not, and this can lead to buffer overflow and
random kernel panic issue.

Correct len to be total number of PM_ENTRY_BYTES in buffer.

[akpm@linux-foundation.org: document pagemapread.pos and .len units, fix PM_ENTRY_BYTES definition]
Signed-off-by: Yonghua Zheng <younghua.zheng@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoarch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
Chen Gang [Tue, 13 Aug 2013 23:01:02 +0000 (16:01 -0700)]
arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"

All architectures include "kernel/Kconfig.freezer" except three left, so
let them include it too, or 'allmodconfig' will report error.

The related errors: (with allmodconfig for openrisc):

    CC      kernel/cgroup_freezer.o
  kernel/cgroup_freezer.c: In function 'freezer_css_online':
  kernel/cgroup_freezer.c:133:15: error: 'system_freezing_cnt' undeclared (first use in this function)
  kernel/cgroup_freezer.c:133:15: note: each undeclared identifier is reported only once for each function it appears in
  kernel/cgroup_freezer.c: In function 'freezer_css_offline':
  kernel/cgroup_freezer.c:157:15: error: 'system_freezing_cnt' undeclared (first use in this function)
  kernel/cgroup_freezer.c: In function 'freezer_attach':
  kernel/cgroup_freezer.c:200:4: error: implicit declaration of function 'freeze_task'
  kernel/cgroup_freezer.c: In function 'freezer_apply_state':
  kernel/cgroup_freezer.c:371:16: error: 'system_freezing_cnt' undeclared (first use in this function)

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
Jeff Liu [Tue, 13 Aug 2013 23:01:01 +0000 (16:01 -0700)]
ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()

Fix a NULL pointer deference while removing an empty directory, which
was introduced by commit 3704412bdbf3 ("[readdir] convert ocfs2").

  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: [<(null)>]           (null)
  PGD 6da85067 PUD 6da89067 PMD 0
  Oops: 0010 [#1] SMP
  CPU: 0 PID: 6564 Comm: rmdir Tainted: G           O 3.11.0-rc1 #4
  RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
  Call Trace:
    ocfs2_dir_foreach+0x49/0x50 [ocfs2]
    ocfs2_empty_dir+0x12c/0x3e0 [ocfs2]
    ocfs2_unlink+0x56e/0xc10 [ocfs2]
    vfs_rmdir+0xd5/0x140
    do_rmdir+0x1cb/0x1e0
    SyS_rmdir+0x16/0x20
    system_call_fastpath+0x16/0x1b
  Code:  Bad RIP value.
  RIP  [<          (null)>]           (null)
  RSP <ffff88006daddc10>
  CR2: 0000000000000000

[dan.carpenter@oracle.com: fix pointer math]
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reported-by: David Weber <wb@munzinger.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agox86 get_unmapped_area(): use proper mmap base for bottom-up direction
Radu Caragea [Tue, 13 Aug 2013 23:00:59 +0000 (16:00 -0700)]
x86 get_unmapped_area(): use proper mmap base for bottom-up direction

When the stack is set to unlimited, the bottomup direction is used for
mmap-ings but the mmap_base is not used and thus effectively renders
ASLR for mmapings along with PIE useless.

Cc: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Sendroiu <molecula2788@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
Tiger Yang [Tue, 13 Aug 2013 23:00:58 +0000 (16:00 -0700)]
ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page

Since ocfs2_cow_file_pos will invoke ocfs2_refcount_icow with a NULL as
the struct file pointer, it finally result in a null pointer dereference
in ocfs2_duplicate_clusters_by_page.

This patch replace file pointer with inode pointer in
cow_duplicate_clusters to fix this issue.

[jeff.liu@oracle.com: rebased patch against linux-next tree]
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Tao Ma <tm@tao.ma>
Tested-by: David Weber <wb@munzinger.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoocfs2: Revert 40bd62e to avoid regression in extended allocation
Jie Liu [Tue, 13 Aug 2013 23:00:57 +0000 (16:00 -0700)]
ocfs2: Revert 40bd62e to avoid regression in extended allocation

Revert commit 40bd62eb7fb8 ("fs/ocfs2/journal.h: add bits_wanted while
calculating credits in ocfs2_calc_extend_credits").

Unfortunately this change broke fallocate even if there is insufficient
disk space for the preallocation, which is a serious problem.

  # df -h
  /dev/sda8        22G  1.2G   21G   6% /ocfs2
  # fallocate -o 0 -l 200M /ocfs2/testfile
  fallocate: /ocfs2/test: fallocate failed: No space left on device

and a kernel warning:

  CPU: 3 PID: 3656 Comm: fallocate Tainted: G        W  O 3.11.0-rc3 #2
  Call Trace:
    dump_stack+0x77/0x9e
    warn_slowpath_common+0xc4/0x110
    warn_slowpath_null+0x2a/0x40
    start_this_handle+0x6c/0x640 [jbd2]
    jbd2__journal_start+0x138/0x300 [jbd2]
    jbd2_journal_start+0x23/0x30 [jbd2]
    ocfs2_start_trans+0x166/0x300 [ocfs2]
    __ocfs2_extend_allocation+0x38f/0xdb0 [ocfs2]
    ocfs2_allocate_unwritten_extents+0x3c9/0x520
    __ocfs2_change_file_space+0x5e0/0xa60 [ocfs2]
    ocfs2_fallocate+0xb1/0xe0 [ocfs2]
    do_fallocate+0x1cb/0x220
    SyS_fallocate+0x6f/0xb0
    system_call_fastpath+0x16/0x1b
  JBD2: fallocate wants too many credits (51216 > 4381)

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling...
Lothar Waßmann [Tue, 13 Aug 2013 23:00:56 +0000 (16:00 -0700)]
drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit

It's always a bad idea to poll on HW bits without a timeout.

The i.MX28 RTC can be easily brought into a state in which the RTC is
not running (until after a power-on-reset) and thus the status bits
which are polled in the driver won't ever change.

This patch prevents the kernel from getting stuck in this case.

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agohugetlb: fix lockdep splat caused by pmd sharing
Michal Hocko [Tue, 13 Aug 2013 23:00:55 +0000 (16:00 -0700)]
hugetlb: fix lockdep splat caused by pmd sharing

Dave has reported the following lockdep splat:

  =================================
  [ INFO: inconsistent lock state ]
  3.11.0-rc1+ #9 Not tainted
  ---------------------------------
  inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
  kswapd0/49 [HC0[0]:SC0[0]:HE1:SE1] takes:
   (&mapping->i_mmap_mutex){+.+.?.}, at: [<c114971b>] page_referenced+0x87/0x5e3
  {RECLAIM_FS-ON-W} state was registered at:
     mark_held_locks+0x81/0xe7
     lockdep_trace_alloc+0x5e/0xbc
     __alloc_pages_nodemask+0x8b/0x9b6
     __get_free_pages+0x20/0x31
     get_zeroed_page+0x12/0x14
     __pmd_alloc+0x1c/0x6b
     huge_pmd_share+0x265/0x283
     huge_pte_alloc+0x5d/0x71
     hugetlb_fault+0x7c/0x64a
     handle_mm_fault+0x255/0x299
     __do_page_fault+0x142/0x55c
     do_page_fault+0xd/0x16
     error_code+0x6c/0x74
  irq event stamp: 3136917
  hardirqs last  enabled at (3136917):  _raw_spin_unlock_irq+0x27/0x50
  hardirqs last disabled at (3136916):  _raw_spin_lock_irq+0x15/0x78
  softirqs last  enabled at (3136180):  __do_softirq+0x137/0x30f
  softirqs last disabled at (3136175):  irq_exit+0xa8/0xaa
  other info that might help us debug this:
   Possible unsafe locking scenario:
         CPU0
         ----
    lock(&mapping->i_mmap_mutex);
    <Interrupt>
      lock(&mapping->i_mmap_mutex);

  *** DEADLOCK ***
  no locks held by kswapd0/49.

  stack backtrace:
  CPU: 1 PID: 49 Comm: kswapd0 Not tainted 3.11.0-rc1+ #9
  Hardware name: Dell Inc.                 Precision WorkStation 490    /0DT031, BIOS A08 04/25/2008
  Call Trace:
    dump_stack+0x4b/0x79
    print_usage_bug+0x1d9/0x1e3
    mark_lock+0x1e0/0x261
    __lock_acquire+0x623/0x17f2
    lock_acquire+0x7d/0x195
    mutex_lock_nested+0x6c/0x3a7
    page_referenced+0x87/0x5e3
    shrink_page_list+0x3d9/0x947
    shrink_inactive_list+0x155/0x4cb
    shrink_lruvec+0x300/0x5ce
    shrink_zone+0x53/0x14e
    kswapd+0x517/0xa75
    kthread+0xa8/0xaa
    ret_from_kernel_thread+0x1b/0x28

which is a false positive caused by hugetlb pmd sharing code which
allocates a new pmd from withing mapping->i_mmap_mutex.  If this
allocation causes reclaim then the lockdep detector complains that we
might self-deadlock.

This is not correct though, because hugetlb pages are not reclaimable so
their mapping will be never touched from the reclaim path.

The patch tells lockup detector that hugetlb i_mmap_mutex is special by
assigning it a separate lockdep class so it won't report possible
deadlocks on unrelated mappings.

[peterz@infradead.org: comment for annotation]
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaoe: adjust ref of head for compound page tails
Ed Cashin [Tue, 13 Aug 2013 23:00:53 +0000 (16:00 -0700)]
aoe: adjust ref of head for compound page tails

Fix a BUG which can trigger when direct-IO is used with AOE.

As discussed previously, the fact that some users of the block layer
provide bios that point to pages with a zero _count means that it is not
OK for the network layer to do a put_page on the skb frags during an
skb_linearize, so the aoe driver gets a reference to pages in bios and
puts the reference before ending the bio.  And because it cannot use
get_page on a page with a zero _count, it manipulates the value
directly.

It is not OK to increment the _count of a compound page tail, though,
since the VM layer will VM_BUG_ON a non-zero _count.  Block users that
do direct I/O can result in the aoe driver seeing compound page tails in
bios.  In that case, the same logic works as long as the head of the
compound page is used instead of the tails.  This patch handles compound
pages and does not BUG.

It relies on the block layer user leaving the relationship between the
page tail and its head alone for the duration between the submission of
the bio and its completion, whether successful or not.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomicroblaze: fix clone syscall
Michal Simek [Tue, 13 Aug 2013 23:00:53 +0000 (16:00 -0700)]
microblaze: fix clone syscall

Fix inadvertent breakage in the clone syscall ABI for Microblaze that
was introduced in commit f3268edbe6fe ("microblaze: switch to generic
fork/vfork/clone").

The Microblaze syscall ABI for clone takes the parent tid address in the
4th argument; the third argument slot is used for the stack size.  The
incorrectly-used CLONE_BACKWARDS type assigned parent tid to the 3rd
slot.

This commit restores the original ABI so that existing userspace libc
code will work correctly.

All kernel versions from v3.8-rc1 were affected.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm: save soft-dirty bits on file pages
Cyrill Gorcunov [Tue, 13 Aug 2013 23:00:51 +0000 (16:00 -0700)]
mm: save soft-dirty bits on file pages

Andy reported that if file page get reclaimed we lose the soft-dirty bit
if it was there, so save _PAGE_BIT_SOFT_DIRTY bit when page address get
encoded into pte entry.  Thus when #pf happens on such non-present pte
we can restore it back.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm: save soft-dirty bits on swapped pages
Cyrill Gorcunov [Tue, 13 Aug 2013 23:00:49 +0000 (16:00 -0700)]
mm: save soft-dirty bits on swapped pages

Andy Lutomirski reported that if a page with _PAGE_SOFT_DIRTY bit set
get swapped out, the bit is getting lost and no longer available when
pte read back.

To resolve this we introduce _PTE_SWP_SOFT_DIRTY bit which is saved in
pte entry for the page being swapped out.  When such page is to be read
back from a swap cache we check for bit presence and if it's there we
clear it and restore the former _PAGE_SOFT_DIRTY bit back.

One of the problem was to find a place in pte entry where we can save
the _PTE_SWP_SOFT_DIRTY bit while page is in swap.  The _PAGE_PSE was
chosen for that, it doesn't intersect with swap entry format stored in
pte.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomemcg: don't initialize kmem-cache destroying work for root caches
Andrey Vagin [Tue, 13 Aug 2013 23:00:47 +0000 (16:00 -0700)]
memcg: don't initialize kmem-cache destroying work for root caches

struct memcg_cache_params has a union.  Different parts of this union
are used for root and non-root caches.  A part with destroying work is
used only for non-root caches.

I fixed the same problem in another place v3.9-rc1-16204-gf101a94, but
didn't notice this one.

This patch fixes the kernel panic:

[   46.848187] BUG: unable to handle kernel paging request at 000000fffffffeb8
[   46.849026] IP: [<ffffffff811a484c>] kmem_cache_destroy_memcg_children+0x6c/0xc0
[   46.849092] PGD 0
[   46.849092] Oops: 0000 [#1] SMP
...

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Glauber Costa <glommer@openvz.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@vger.kernel.org> [3.9.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoipv6: make unsolicited report intervals configurable for mld
Hannes Frederic Sowa [Tue, 13 Aug 2013 23:03:46 +0000 (01:03 +0200)]
ipv6: make unsolicited report intervals configurable for mld

Commit cab70040dfd95ee32144f02fade64f0cb94f31a0 ("net: igmp:
Reduce Unsolicited report interval to 1s when using IGMPv3") and
2690048c01f32bf45d1c1e1ab3079bc10ad2aea7 ("net: igmp: Allow user-space
configuration of igmp unsolicited report interval") by William Manley made
igmp unsolicited report intervals configurable per interface and corrected
the interval of unsolicited igmpv3 report messages resendings to 1s.

Same needs to be done for IPv6:

MLDv1 (RFC2710 7.10.): 10 seconds
MLDv2 (RFC3810 9.11.): 1 second

Both intervals are configurable via new procfs knobs
mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.

(also added .force_mld_version to ipv6_devconf_dflt to bring structs in
line without semantic changes)

v2:
a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
   unsolicited_report_interval procfs knobs.
b) incorporate stylistic feedback from William Manley

v3:
a) add new DEVCONF_* values to the end of the enum (thanks to David
   Miller)

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: William Manley <william.manley@youview.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>