git.openwrt.org Git - openwrt/staging/blogic.git/log

netfilter: nft_ct: remove family from struct nft_ct

Since we have the context available during destruction again, we can
remove the family from the private structure.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: restore notifications for anonymous set destruction

Since we have the context available again, we can restore notifications
for destruction of anonymous sets.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: restore context for expression destructors

In order to fix set destruction notifications and get rid of unnecessary
members in private data structures, pass the context to expressions'
destructor functions again.

In order to do so, replace various members in the nft_rule_trans structure
by the full context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: clean up nf_tables_trans_add() argument order

The context argument logically comes first, and this is what every other
function dealing with contexts does.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nft_hash: bug fixes and resizing

The hash set type is very broken and was never meant to be merged in this
state. Missing RCU synchronization on element removal, leaking chain
refcounts when used as a verdict map, races during lookups, a fixed table
size are probably just some of the problems. Luckily it is currently
never chosen by the kernel when the rbtree type is also available.

Rewrite it to be usable.

The new implementation supports automatic hash table resizing using RCU,
based on Paul McKenney's and Josh Triplett's algorithm "Optimized Resizing
For RCU-Protected Hash Tables" described in [1].

Resizing doesn't require a second list head in the elements, it works by
chosing a hash function that remaps elements to a predictable set of buckets,
only resizing by integral factors and

- during expansion: linking new buckets to the old bucket that contains
  elements for any of the new buckets, thereby creating imprecise chains,
  then incrementally seperating the elements until the new buckets only
  contain elements that hash directly to them.

- during shrinking: linking the hash chains of all old buckets that hash
  to the same new bucket to form a single chain.

Expansion requires at most the number of elements in the longest hash chain
grace periods, shrinking requires a single grace period.

Due to the requirement of having hash chains/elements linked to multiple
buckets during resizing, homemade single linked lists are used instead of
the existing list helpers, that don't support this in a clean fashion.
As a side effect, the amount of memory required per element is reduced by
one pointer.

Expansion is triggered when the load factors exceeds 75%, shrinking when
the load factor goes below 30%. Both operations are allowed to fail and
will be retried on the next insertion or removal if their respective
conditions still hold.

[1] http://dl.acm.org/citation.cfm?id=2002181.2002192

Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: conntrack: remove central spinlock nf_conntrack_lock

nf_conntrack_lock is a monolithic lock and suffers from huge contention
on current generation servers (8 or more core/threads).

Perf locking congestion is clear on base kernel:

-  72.56%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
   - _raw_spin_lock_bh
      + 25.33% init_conntrack
      + 24.86% nf_ct_delete_from_lists
      + 24.62% __nf_conntrack_confirm
      + 24.38% destroy_conntrack
      + 0.70% tcp_packet
+   2.21%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
+   1.15%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
+   0.77%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
+   0.70%  ksoftirqd/6  [nf_conntrack]       [k] nf_ct_delete
+   0.55%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table

This patch change conntrack locking and provides a huge performance
improvement.  SYN-flood attack tested on a 24-core E5-2695v2(ES) with
10Gbit/s ixgbe (with tool trafgen):

Base kernel:   810.405 new conntrack/sec
After patch: 2.233.876 new conntrack/sec

Notice other floods attack (SYN+ACK or ACK) can easily be deflected using:
# iptables -A INPUT -m state --state INVALID -j DROP
# sysctl -w net/netfilter/nf_conntrack_tcp_loose=0

Use an array of hashed spinlocks to protect insertions/deletions of
conntracks into the hash table. 1024 spinlocks seem to give good
results, at minimal cost (4KB memory). Due to lockdep max depth,
1024 becomes 8 if CONFIG_LOCKDEP=y

The hash resize is a bit tricky, because we need to take all locks in
the array. A seqcount_t is used to synchronize the hash table users
with the resizing process.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: conntrack: seperate expect locking from nf_conntrack_lock

Netfilter expectations are protected with the same lock as conntrack
entries (nf_conntrack_lock). This patch split out expectations locking
to use it's own lock (nf_conntrack_expect_lock).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: avoid race with exp->master ct

Preparation for disconnecting the nf_conntrack_lock from the
expectations code. Once the nf_conntrack_lock is lifted, a race
condition is exposed.

The expectations master conntrack exp->master, can race with
delete operations, as the refcnt increment happens too late in
init_conntrack(). Race is against other CPUs invoking
->destroy() (destroy_conntrack()), or nf_ct_delete() (via timeout
or early_drop()).

Avoid this race in nf_ct_find_expectation() by using atomic_inc_not_zero(),
and checking if nf_ct_is_dying() (path via nf_ct_delete()).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: conntrack: spinlock per cpu to protect special lists.

One spinlock per cpu to protect dying/unconfirmed/template special lists.
(These lists are now per cpu, a bit like the untracked ct)
Add a @cpu field to nf_conn, to make sure we hold the appropriate
spinlock at removal time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: trivial code cleanup and doc changes

Changes while reading through the netfilter code.

Added hint about how conntrack nf_conn refcnt is accessed.
And renamed repl_hash to reply_hash for readability

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Merge git://git./linux/kernel/git/horms/ipvs-next

Via Simon Horman:

====================
* Whitespace cleanup spotted by checkpatch.pl from Tingwei Liu.
* Section conflict cleanup, basically removal of one wrong __read_mostly,
from Andi Kleen.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ipvs: Reduce checkpatch noise in ip_vs_lblc.c

Add whitespace after operator and put open brace { on the previous line

Cc: Tingwei Liu <liutingwei@hisense.com>
Cc: lvs-devel@vger.kernel.org
Signed-off-by: Tingwei Liu <tingw.liu@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

sections, ipvs: Remove useless __read_mostly for ipvs genl_ops

const __read_mostly does not make any sense, because const
data is already read-only. Remove the __read_mostly
for the ipvs genl_ops. This avoids a LTO
section conflict compile problem.

Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Patrick McHardy <kaber@trash.net>
Cc: lvs-devel@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

netfilter: ipset: add forceadd kernel support for hash set types

Adds a new property for hash set types, where if a set is created
with the 'forceadd' option and the set becomes full the next addition
to the set may succeed and evict a random entry from the set.

To keep overhead low eviction is done very simply. It checks to see
which bucket the new entry would be added. If the bucket's pos value
is non-zero (meaning there's at least one entry in the bucket) it
replaces the first entry in the bucket. If pos is zero, then it continues
down the normal add process.

This property is useful if you have a set for 'ban' lists where it may
not matter if you release some entries from the set early.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: move registration message to init from net_init

Commit 1785e8f473 ("netfiler: ipset: Add net namespace for ipset") moved
the initialization print into net_init, which can get called a lot due
to namespaces. Move it back into init, reduce to pr_info.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: kernel: uapi: fix MARKMASK attr ABI breakage

commit 2dfb973c0dcc6d2211 (add markmask for hash:ip,mark data type)
inserted IPSET_ATTR_MARKMASK in-between other enum values, i.e.
changing values of all further attributes. This causes 'ipset list'
segfault on existing kernels since ipset no longer finds
IPSET_ATTR_MEMSIZE (it has a different value on kernel side).

Jozsef points out it should be moved below IPSET_ATTR_MARK which
works since there is some extra reserved space after that value.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Prepare the kernel for create option flags when no extension is needed

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: add markmask for hash:ip,mark data type

Introduce packet mark mask for hash:ip,mark data type. This allows to
set mark bit filter for the ip set.

Change-Id: Id8dd9ca7e64477c4f7b022a1d9c1a5b187f1c96e

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: add hash:ip,mark data type to ipset

Introduce packet mark support with new ip,mark hash set. This includes
userspace and kernelspace code, hash:ip,mark set tests and man page
updates.

The intended use of ip,mark set is similar to the ip:port type, but for
protocols which don't use a predictable port number. Instead of port
number it matches a firewall mark determined by a layer 7 filtering
program like opendpi.

As well as allowing or blocking traffic it will also be used for
accounting packets and bytes sent for each protocol.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Add hash: fix coccinelle warnings

net/netfilter/ipset/ip_set_hash_netnet.c:115:8-9: WARNING: return of 0/1 in function 'hash_netnet4_data_list' with return type bool
/c/kernel-tests/src/cocci/net/netfilter/ipset/ip_set_hash_netnet.c:338:8-9: WARNING: return of 0/1 in function 'hash_netnet6_data_list' with return type bool

Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: coccinelle/misc/boolreturn.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: ipset: Follow manual page behavior for SET target on list:set

ipset(8) for list:set says:
  The match will try to find a matching entry in the sets and the
  target will try to add an entry to the first set to which it can
  be added.

However real behavior is bit differ from described. Consider example:

# ipset create test-1-v4 hash:ip family inet
# ipset create test-1-v6 hash:ip family inet6
# ipset create test-1 list:set
# ipset add test-1 test-1-v4
# ipset add test-1 test-1-v6

# iptables  -A INPUT -p tcp --destination-port 25 -j SET --add-set test-1 src
# ip6tables -A INPUT -p tcp --destination-port 25 -j SET --add-set test-1 src

And then when iptables/ip6tables rule matches packet IPSET target
tries to add src from packet to the list:set test-1 where first
entry is test-1-v4 and the second one is test-1-v6.

For IPv4, as it first entry in test-1 src added to test-1-v4
correctly, but for IPv6 src not added!

Placing test-1-v6 to the first element of list:set makes behavior
correct for IPv6, but brokes for IPv4.

This is due to result, returned from ip_set_add() and ip_set_del() from
net/netfilter/ipset/ip_set_core.c when set in list:set equires more
parameters than given or address families do not match (which is this
case).

It seems wrong returning 0 from ip_set_add() and ip_set_del() in
this case, as 0 should be returned only when an element successfuly
added/deleted to/from the set, contrary to ip_set_test() which
returns 0 when no entry exists and >0 when entry found in set.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

netfilter: nf_tables: add optional user data area to rules

This allows us to store user comment strings, but it could be also
used to store any kind of information that the user application needs
to link to the rule.

Scratch 8 bits for the new ulen field that indicates the length the
user data area. 4 bits from the handle (so it's 42 bits long, according
to Patrick, it would last 139 years with 1000 new rules per second)
and 4 bits from dlen (so the expression data area is 4K, which seems
sufficient by now even considering the compatibility layer).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>

netfilter: nfnetlink_log: remove unused code

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: accept QUEUE/DROP verdict parameters

Allow userspace to specify the queue number or the errno code for QUEUE
and DROP verdicts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: add nft_dereference() macro

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nfnetlink: add rcu_dereference_protected() helpers

Add a lockdep_nfnl_is_held() function and a nfnl_dereference() macro for
RCU dereferences protected by a NFNL subsystem mutex.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: ip_set: rename nfnl_dereference()/nfnl_set()

The next patch will introduce a nfnl_dereference() macro that actually
checks that the appropriate mutex is held and therefore needs a
subsystem argument.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nft_ct: labels get support

This also adds NF_CT_LABELS_MAX_SIZE so it can be re-used
as BUILD_BUG_ON in nft_ct.

At this time, nft doesn't yet support writing to the label area;
when this changes the label->words handling needs to be moved
out of xt_connlabel.c into nf_conntrack_labels.c.

Also removes a useless run-time check: words cannot grow beyond
4 (32 bit) or 2 (64bit) since xt_connlabel enforces a maximum of
128 labels.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: xt_ipcomp: Use ntohs to ease sparse warning

0-DAY kernel build testing backend reported:

sparse warnings: (new ones prefixed by >>)

>> >> net/netfilter/xt_ipcomp.c:63:26: sparse: restricted __be16 degrades to integer
>> >> net/netfilter/xt_ipcomp.c:63:26: sparse: cast to restricted __be32

Fix this by using ntohs without shifting.

Tested with: make C=1 CF=-D__CHECK_ENDIAN__

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: remove double colon

This is C not shell script

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Merge git://git./linux/kernel/git/davem/net

Conflicts:
drivers/net/bonding/bond_3ad.h
drivers/net/bonding/bond_main.c

Two minor conflicts in bonding, both of which were overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Lots of little small things, nothing too major: nouveau regression
  fixes, vmware fixes for the new hw support, memory leaks in error path
  fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (31 commits)
  drm/radeon/ni: fix typo in dpm sq ramping setup
  drm/radeon/si: fix typo in dpm sq ramping setup
  drm/radeon: fix CP semaphores on CIK
  drm/radeon: delete a stray tab
  drm/radeon: fix display tiling setup on SI
  drm/radeon/dpm: reduce r7xx vblank mclk threshold to 200
  drm/radeon: fill in DRM_CAPs for cursor size
  drm: add DRM_CAPs for cursor size
  drm/radeon: unify bpc handling
  drm/ttm: Fix memory leak in ttm_agp_backend.c
  drm/ttm: declare 'struct device' in ttm_page_alloc.h
  drm/nouveau: fix TTM_PL_TT memtype on pre-nv50
  drm/nv50/disp: use correct register to determine DP display bpp
  drm/nouveau/fb: use correct ram oclass for nv1a hardware
  drm/nv50/gr: add missing nv_error parameter priv
  drm/nouveau: fix ENG_RUNLIST register address
  drm/nv4c/bios: disallow retrieving from prom on nv4x igp's
  drm/nv4c/vga: decode register is in a different place on nv4x igp's
  drm/nv4c/mc: nv4x igp's have a different msi rearm register
  drm/nouveau: set irq_enabled manually
  ...

Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid

Pull HID update from Jiri Kosina:

- fixes for several bugs in incorrect allocations of buffers by David
   Herrmann and Benjamin Tissoires.

- support for a few new device IDs by Archana Patni, Benjamin
   Tissoires, Huei-Horng Yo, Reyad Attiyat and Yufeng Shen

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: hyperv: make sure input buffer is big enough
  HID: Bluetooth: hidp: make sure input buffers are big enough
  HID: hid-sensor-hub: quirk for STM Sensor hub
  HID: apple: add Apple wireless keyboard 2011 JIS model support
  HID: fix buffer allocations
  HID: multitouch: add FocalTech FTxxxx support
  HID: microsoft: Add ID's for Surface Type/Touch Cover 2
  HID: usbhid: quirk for CY-TM75 75 inch Touch Overlay

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) kvaser CAN driver has fixed limits of some of it's table, validate
    that we won't exceed those limits at probe time.  Fix from Olivier
    Sobrie.

2) Fix rtl8192ce disabling interrupts for too long, from Olivier
    Langlois.

3) Fix botched shift in ath5k driver, from Dan Carpenter.

4) Fix corruption of deferred packets in TIPC, from Erik Hugne.

5) Fix newlink error path in macvlan driver, from Cong Wang.

6) Fix netpoll deadlock in bonding, from Ding Tianhong.

7) Handle GSO packets properly in forwarding path when fragmentation is
    necessary on egress, from Florian Westphal.

8) Fix axienet build errors, from Michal Simek.

9) Fix refcounting of ubufs on tx in vhost net driver, from Michael S
    Tsirkin.

10) Carrier status isn't set properly in hyperv driver, from Haiyang
    Zhang.

11) Missing pci_disable_device() in tulip_remove_one), from Ingo Molnar.

12) AF_PACKET qdisc bypass mode doesn't adhere to driver provided TX
    queue selection method.  Add a fallback method mechanism to fix this
    bug, from Daniel Borkmann.

13) Fix regression in link local route handling on GRE tunnels, from
    Nicolas Dichtel.

14) Bonding can assign dup aggregator IDs in some sequences of
    configuration, fix by making the allocation counter per-bond instead
    of global.  From Jiri Bohac.

15) sctp_connectx() needs compat translations, from Daniel Borkmann.

16) Fix of_mdio PHY interrupt parsing, from Ben Dooks

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
  MAINTAINERS: add entry for the PHY library
  of_mdio: fix phy interrupt passing
  net: ethernet: update dependency and help text of mvneta
  NET: fec: only enable napi if we are successful
  af_packet: remove a stray tab in packet_set_ring()
  net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
  ipv4: fix counter in_slow_tot
  irtty-sir.c: Do not set_termios() on irtty_close()
  bonding: 802.3ad: make aggregator_identifier bond-private
  usbnet: remove generic hard_header_len check
  gre: add link local route when local addr is any
  batman-adv: fix potential kernel paging error for unicast transmissions
  batman-adv: avoid double free when orig_node initialization fails
  batman-adv: free skb on TVLV parsing success
  batman-adv: fix TT CRC computation by ensuring byte order
  batman-adv: fix potential orig_node reference leak
  batman-adv: avoid potential race condition when adding a new neighbour
  batman-adv: properly check pskb_may_pull return value
  batman-adv: release vlan object after checking the CRC
  batman-adv: fix TT-TVLV parsing on OGM reception
  ...

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
"A range of ARM fixes.  Biggest change is the stage-2 attributes used
  for for hyp mode which were wrong.  I've killed some bits in a couple
  of DT files which turned out not to be required, and a few other
  fixes.

  One fix touches code outside of arch/arm, which is related to sorting
  out the DMA masks correctly.  There is a long standing issue with the
  conversion from PFNs to addresses where people assume that shifting an
  unsigned long left by PAGE_SHIFT results in a correct address.  This
  is not the case with C: the integer promotion happens at assignment
  after evaluation.  This fixes the recently introduced dma_max_pfn()
  function, but there's a number of other places where we try this
  directly on an unsigned long in the mm code"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 7957/1: add DSB after icache flush in __flush_icache_all()
  Fix uses of dma_max_pfn() when converting to a limiting address
  ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev
  ARM: 7953/1: mm: ensure TLB invalidation is complete before enabling MMU
  ARM: 7952/1: mm: Fix the memblock allocation for LPAE machines
  ARM: 7950/1: mm: Fix stage-2 device memory attributes
  ARM: dts: fix spdif pinmux configuration

Merge tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy

Pull jfs fix from David Kleikamp:
"Another ACL regression. This one more subtle"

* tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy:
jfs: set i_ctime when setting ACL

rtnl: make ifla_policy static

The only place this is used outside rtnetlink.c is veth. So provide
wrapper function for this usage.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

hsr: Use ether_addr_copy

It's slightly smaller/faster for some architectures.
Make sure def_multicast_addr is __aligned(2)

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: add entry for the PHY library

The PHY library has been subject to some changes, new drivers and DT
interactions over the past few months. Add myself as a maintainer for
the core PHY library parts and drivers. Make sure the PHY library entry
also covers the Device Tree files which have a close interaction with
the MDIO bus, PHY connection and Ethernet PHY mode parsing.

CC: Grant Likely <grant.likely@linaro.org>
CC: Shaohui Xie <shaohui.xie@freescale.com>
CC: Andy Fleming <afleming@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

of_mdio: fix phy interrupt passing

The of_mdiobus_register_phy() is not setting phy->irq thus causing
some drivers to incorrectly assume that the PHY does not have an
IRQ associated with it. Not only do some drivers report no IRQ
they do not install an interrupt handler for the PHY.

Simplify the code setting irq and set the phy->irq at the same
time so that we cover the following issues, which should cover
all the cases the code will find:

- Set phy->irq if node has irq property and mdio->irq is NULL
- Set phy->irq if node has no irq and mdio->irq is not NULL
- Leave phy->irq as PHY_POLL default if none of the above

This fixes the issue:
net eth0: attached PHY 1 (IRQ -1) to driver Micrel KSZ8041RNLI

to the correct:
net eth0: attached PHY 1 (IRQ 416) to driver Micrel KSZ8041RNLI

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: remove some unused include in flowlabel

These include are here since kernel 2.2.7, but probably never used.

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

ieee802154: fix faulty check in set_phy_params api

phy_set_csma_params has a redundant (and impossible) check for
"retries", found by smatch. The check was supposed to be for
frame_retries, but wasn't moved during development when
phy_set_frame_retries was introduced. Also, maxBE >= 3 as required by
the standard is not enforced.

Remove the redundant check, assure max_be >= 3 and check -1 <=
frame_retries <= 7 in the correct function.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: update dependency and help text of mvneta

With the introduction of the support for Armada 375 and Armada 38x,
the hidden Kconfig option MACH_ARMADA_370_XP is being renamed to
MACH_MVEBU_V7. Therefore, the dependency that was used for the mvneta
driver can no longer work. This commit replaces this dependency by a
dependency on PLAT_ORION, which is used similarly for the mv643xx_eth
driver.

In addition to this, it takes this opportunity to adjust the
description and help text to indicate that the driver can is also used
for Armada 38x. Note that Armada 375 cannot use this driver as it has
a completely different networking unit, which will require a separate
driver.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

NET: fec: only enable napi if we are successful

If napi is left enabled after a failed attempt to bring the interface
up, we BUG:

fec 2188000.ethernet eth0: no PHY, assuming direct connection to switch
libphy: PHY fixed-0:00 not found
fec 2188000.ethernet eth0: could not attach to PHY
------------[ cut here ]------------
kernel BUG at include/linux/netdevice.h:502!
Internal error: Oops - BUG: 0 [#1] SMP ARM
...
PC is at fec_enet_open+0x4d0/0x500
LR is at __dev_open+0xa4/0xfc

Only enable napi after we are past all the failure paths.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_packet: remove a stray tab in packet_set_ring()

At first glance it looks like there is a missing curly brace but
actually the code works the same either way. I have adjusted the
indenting but left the code the same.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: broadcom-bcmgenet: fix address and cells properties

This patch fixes a typo in the Device Tree binding for the
leading '#'.

Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: align tipc function names with common naming practice in the network

Rename the following functions, which are shorter and more in line
with common naming practice in the network subsystem.

tipc_bclink_send_msg->tipc_bclink_xmit
tipc_bclink_recv_pkt->tipc_bclink_rcv
tipc_disc_recv_msg->tipc_disc_rcv
tipc_link_send_proto_msg->tipc_link_proto_xmit
link_recv_proto_msg->tipc_link_proto_rcv
link_send_sections_long->tipc_link_iovec_long_xmit
tipc_link_send_sections_fast->tipc_link_iovec_xmit_fast
tipc_link_send_sync->tipc_link_sync_xmit
tipc_link_recv_sync->tipc_link_sync_rcv
tipc_link_send_buf->__tipc_link_xmit
tipc_link_send->tipc_link_xmit
tipc_link_send_names->tipc_link_names_xmit
tipc_named_recv->tipc_named_rcv
tipc_link_recv_bundle->tipc_link_bundle_rcv
tipc_link_dup_send_queue->tipc_link_dup_queue_xmit
link_send_long_buf->tipc_link_frag_xmit

tipc_multicast->tipc_port_mcast_xmit
tipc_port_recv_mcast->tipc_port_mcast_rcv
tipc_port_reject_sections->tipc_port_iovec_reject
tipc_port_recv_proto_msg->tipc_port_proto_rcv
tipc_connect->tipc_port_connect
__tipc_connect->__tipc_port_connect
__tipc_disconnect->__tipc_port_disconnect
tipc_disconnect->tipc_port_disconnect
tipc_shutdown->tipc_port_shutdown
tipc_port_recv_msg->tipc_port_rcv
tipc_port_recv_sections->tipc_port_iovec_rcv

release->tipc_release
accept->tipc_accept
bind->tipc_bind
get_name->tipc_getname
poll->tipc_poll
send_msg->tipc_sendmsg
send_packet->tipc_send_packet
send_stream->tipc_send_stream
recv_msg->tipc_recvmsg
recv_stream->tipc_recv_stream
connect->tipc_connect
listen->tipc_listen
shutdown->tipc_shutdown
setsockopt->tipc_setsockopt
getsockopt->tipc_getsockopt

Above changes have no impact on current users of the functions.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: add device tree support

Add support of the device tree probing for the Renesas SH-Mobile SoCs
documenting the device tree binding as necessary.

This work is loosely based on the original patch by Nobuhiro Iwamatsu
<nobuhiro.iwamatsu.yj@renesas.com>.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'ttm-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux into drm-fixes

Pull request of 2014-02-18

One compile fix and one memory leak.

* tag 'ttm-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux:
drm/ttm: Fix memory leak in ttm_agp_backend.c
drm/ttm: declare 'struct device' in ttm_page_alloc.h

Merge tag 'vmwgfx-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux into drm-fixes

Pull request of 2014-02-18.

Nothing special. The biggest change is adding a couple of command defines and
packing the command data correctly.

* tag 'vmwgfx-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux:
  drm/vmwgfx: Fix command defines and checks
  drm/vmwgfx: Fix possible integer overflow
  drm/vmwgfx: Remove stray const
  drm/vmwgfx: unlock on error path in vmw_execbuf_process()
  drm/vmwgfx: Get maximum mob size from register SVGA_REG_MOB_MAX_SIZE
  drm/vmwgfx: Fix a couple of sparse warnings and errors

Merge branch 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Fix for 128x128 cursors, along with some misc fixes.

* 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon/ni: fix typo in dpm sq ramping setup
  drm/radeon/si: fix typo in dpm sq ramping setup
  drm/radeon: fix CP semaphores on CIK
  drm/radeon: delete a stray tab
  drm/radeon: fix display tiling setup on SI
  drm/radeon/dpm: reduce r7xx vblank mclk threshold to 200
  drm/radeon: fill in DRM_CAPs for cursor size
  drm: add DRM_CAPs for cursor size
  drm/radeon: unify bpc handling

DT: net: document Ethernet bindings in one place

This patch is an attempt to gather the Ethernet related bindings in one file,
like it's done in the MMC and some other subsystems. It should save some of
the trouble of documenting several properties over and over in each binding
document, instead only making reference to the main file.

I have used the Embedded Power Architecture(TM) Platform Requirements (ePAPR)
standard as a base for the properties description, also documenting some ad-hoc
properties that have been introduced over time despite having direct analogs in
ePAPR.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Drivers: net: ethernet: 3com: 3c589_cs fixed coding style issues

checkpatch.pl clean-up, from 14 error/ 277 warnings, to 0 errors, 7 warnings

Signed-off-by: Justin van Wijngaarden <justinvanwijngaarden@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless

John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.14 stream...

For the iwlwifi one, Emmanuel says:

"As explicitly written in the commit message, we prefer to disable Tx
AMPDU on NICs supported by iwldvm. This feature gives a big boost in
Tx performance, but the firmware is buggy and we can't rely on it.
Our hope is that most of the users out there want wifi to surf on
the web which means that they care more for Rx traffic than for Tx.
People who want to enable it can do so with the help of a module
parameter."

On top of that...

Dan Carpenter fixes a typo/thinko in ath5k.

Olivier Langlois fixes a couple of rtlwifi issues, one which leaves
IRQs disabled too long (causing a variety of problems elsewhere),
and one which fixes an incorrect return code when failing to enable
the NIC.

Russell King fixes a NULL pointer dereference in hostap.

Stanislaw Gruszka fixes a DMA coherence issue in the rtl8187 driver.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bonding'

Veaceslav Falico says:

====================
bonding: add an option to rely on unvalidated arp packets

v4 -> v5:
Again per Nik's advise correct the bond_opts restrictions for arp_validate
- set it the same as arp_interval.

v3 -> v4:
Per Nikolay's advise, remove the new bond_opts restriction on modes setting
for arp_validate.

v2 -> v3:
Per Jay's advise, use the 'filter' keyword instead of 'arp' one, and use
his text for documentation. Also, rebase on the latest net-next. Sorry for
the delay, didn't manage to send it before net-next was closed.

v1 -> v2:
Don't remove the 'all traffic' functionality - rather, add new arp_validate
options to specify that we want *only* unvalidated arps.

Currently, if arp_validate is off (0), slave_last_rx() returns the
slave->dev->last_rx, which is always updated on *any* packet received by
slave, and not only arps. This means that, if the validation of arps is
off, we're treating *any* incoming packet as a proof of slave being up, and
not only arps.

This might seem logical at the first glance, however it can cause a lot of
troubles and false-positives, one example would be:

The arp_ip_target is NOT accessible, however someone in the broadcast domain
spams with any broadcast traffic. This way bonding will be tricked that the
slave is still up (as in - can access arp_ip_target), while it's not.

The net_device->last_rx is already used in a lot of drivers (even though the
comment states to NOT do it :)), and it's also ugly to modify it from bonding.

However, some loadbalance setups might rely on the fact that even non-arp
traffic is a sign of slave being up - and we definitely can't break anyones
config - so an extension to arp_validate is needed.

So, to fix this, add an option for the user to specify if he wants to
filter out non-arp traffic on unvalidated slaves, remove the last_rx from
bonding, *always* call bond_arp_rcv() in slave's rx_handler (which is
bond_handle_frame), and if we spot an arp there with this option on - update
the slave->last_arp_rx - and use it instead of net_device->last_rx. Finally,
rename last_arp_rx to last_rx to reflect the changes.

Also rename slave->jiffies to ->last_link_up, to reflect better its
meaning, add the new option's documentation and update the arp_validate one
to be a bit more descriptive.
====================

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: rename last_arp_rx to last_rx

To reflect the new meaning.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: trivial: rename slave->jiffies to ->last_link_up

slave->jiffies is updated every time the slave becomes active, which, for
bonding, means that its link is 'up'.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove useless updating of slave->dev->last_rx

Now that all the logic is handled via last_arp_rx, we don't need to use
last_rx.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: use last_arp_rx in bond_loadbalance_arp_mon()

Now that last_arp_rx correctly show the last time we've received an ARP, we
can use it safely instead of slave->dev->last_rx.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: use last_arp_rx in slave_last_rx()

Now that last_arp_rx really has the last time we've received any (validated or
not) ARP, we can use it in slave_last_rx() instead of slave->dev->last_rx.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: use the new options to correctly set last_arp_rx

Now that the options are in place - arp_validate can be set to receive all
the traffic or only arp packets to verify if the slave is up, when the
slave isn't validated.

CC: Rob Landley <rob@landley.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Nikolay Aleksandrov <nikolay@redhat.com>
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: document the new _arp options for arp_validate

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: extend arp_validate to be able to receive unvalidated arp-only traffic

Currently we can either receive any traffic as a proff of slave being up,
or only *validated* arp traffic (i.e. with src/dst ip checked).

Add an option to be able to specify if we want to receive non-validated arp
traffic only.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: always set recv_probe to bond_arp_rcv in arp monitor

Currently we only set bond_arp_rcv() if we're using arp_validate, however
this makes us skip updating last_arp_rx if we're not validating incoming
ARPs - thus, if arp_validate is off, last_arp_rx will never be updated.

Fix this by always setting up recv_probe = bond_arp_rcv, even if we're not
using arp_validate.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: always update last_arp_rx on packet recieve

Currently we're updating the last_arp_rx only when we've validate the
packet, however afterwards we use it as 'ANY last packet received', but not
only validated ARPs.

Fix this by updating it in case of any packet received. It won't break the
arp_validation=0 because we, anyway, return the correct slave->dev->last_rx in
slave_last_rx().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: permit using arp_validate with non-ab modes

Currently it's disabled because it's sometimes hard, in typical configs, to
make it work - because of the nature how the loadbalance modes work - as
it's hard to deliver valid arp replies to correct slaves by the switch.

However we still can use arp_validation in loadbalance with several other
configs, per example with arp_validate == 2 for backup with one broadcast
domain, without the switch(es) doing any balancing - this way we'd be (a
bit more) sure that the slave is up.

So, enable it to let users decide which one works/suits them best. Also
correct the mode limitation from BOND_OPT_ARP_VALIDATE.

CC: Nikolay Aleksandrov <nikolay@redhat.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove bond->lock from bond_arp_rcv

We're always called with rcu_read_lock() held (bond_arp_rcv() is only
called from bond_handle_frame(), which is rx_handler and always called
under rcu from __netif_receive_skb_core() ).

The slave active/passive and/or bonding params can change in-flight, however
we don't really care about that - we only modify the last time packet was
received, which is harmless.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'r8152'

Hayes Wang says:

====================
r8152: improvement and new features

Change some flows or behavior to improve the efficiency or make the
code readable. Besides, support WOL and runtime suspend.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: support get_msglevel and set_msglevel

Support get_msglevel and set_msglevel.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: set disable_hub_initiated_lpm

Set disable_hub_initiated_lpm = 1.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: replace netif_rx with netif_receive_skb

Replace netif_rx with netif_receive_skb to avoid disabling irq frequently
for increasing the efficiency.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: disable teredo for RTL8152

Disable teredo for RTL8152 by default.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: support runtime suspend

Support runtime suspend for RTL8152 and RTL8153.

Move tx_bottom() from tasklet to delayed_work. That avoids to
transmit tx packets after calling autosuspend.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: support WOL

Support WOL for RTL8152 and RTL8153.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: move some functions from probe to open

Add up method for rtl_ops and asign relative functions. Move
clear_bp() and hw_phy_cfg() from init method to up method of rtl_ops.
Call rtl_ops.up() for ndo_open() and rtl_ops.down for ndo_stop().

Replace allocating the memory in probe() with in ndo_open().

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: combine PHY reset with set_speed

PHY reset is necessary after some hw settings. However, it would
cause the linking down, and so does the set_speed function. Combine
the PHY reset with set_speed function. That could reduce the frequency
of linking down and accessing the PHY register.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: clear BMCR_PDOWN

Modify the method of enabling the PHY to clear BMCR_PDOWN only.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: reduce the frequency of spin_lock

Replace getting one item from a list with getting the whole list one time.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: load the default MAC address

Except for RTL_VER_01, replace loading the MAC address from PLA_IDR
with from PLA_BACKUP. The default MAC address may be modified by
the other OS, so the PLA_IDR may be not the default MAC address.

The data in the PLA_BACKUP address of the RTL_VER_01 may be destoryed,
so load MAC address from PLA_IDR for RTL_VER_01.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: replace some types from int to bool

Modify the following functions.
- r8153_u1u2en
- r8153_u2p3en
- r8153_power_cut_en

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: add three functions

Replace some codes with the following three functions.
- rtl_drop_queued_tx
- rxdy_gated_en
- r8152_power_cut_en

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: move some functions

Move the following functions which is for the further coding.
- rtl_clear_bp
- r8153_clear_bp
- r8153_teredo_off
- r8152b_disable_aldps
- r8152b_enable_aldps
- r8152b_hw_phy_cfg

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'cxgb4'

Hariprasad Shenai says:

====================
Adds support for Chelsio T5 40G adapter and Misc. fixes

This patch series adds support for Chelsio T5 40G adapters and provides
miscelleneous fixes for cxgb4 driver.

It also adds device ids of two new T5 adapters.

We would like to request this patch series to get merged via David Miller's
'net-next' tree.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Add more PCI device ids.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Don't assume LSO only uses SGL path in t4_eth_xmit()

Also, modify is_eth_imm() to return header length so it doesn't
have to be recomputed in calc_tx_flits().

Based on original work by Mike Werner <werner@chelsio.com>

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Remove unused registers and add missing ones

Remove unused registers for registers list, and add missing ones
Based on original work by Santosh Rastapur <santosh@chelsio.com>

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Query firmware for T5 ULPTX MEMWRITE DSGL capabilities

Query firmware to see whether we're allowed to use T5 ULPTX MEMWRITE DSGL
capabilities. Also pass that information to Upper Layer Drivers via the
new (struct cxgb4_lld_info).ulptx_memwrite_dsgl boolean.

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: LE-Workaround is not atomic in firmware

The LE workaround in firmware is not atomic and fw_ofld_connection_wrs must not interleave.
Therefore, when the workaround is enabled, we need to send all ctrlq WRs on a single ctrl queue.

Based on original work by Santosh Rastapur <santosh@chelsio.com>

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Allow >10G ports to have multiple queues

Based on original work by Divy Le Ray.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Print adapter VPD Part Number instead of Engineering Change field

When we attach to adapter, print VPD Part Number instead of Engineering Change field.
Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Add support to recognize 40G links

Also, create a new Common Code interface to translate Firmware Port Technology
Type values (enum fw_port_type) to string descriptions. This will allow us
to maintain the description translation table in one place rather than in
every driver.

Based on original work by Scott Bardone and Casey Leedom <leedom@chelsio.com>

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode

SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
sizeof(param) check will always fail in kernel as the structure in
64bit kernel space is 4bytes larger than for user binaries compiled
in 32bit mode. Thus, applications making use of sctp_connectx() won't
be able to run under such circumstances.

Introduce a compat interface in the kernel to deal with such
situations by using a 'struct compat_sctp_getaddrs_old' structure
where user data is copied into it, and then sucessively transformed
into a 'struct sctp_getaddrs_old' structure with the help of
compat_ptr(). That fixes sctp_connectx() abi without any changes
needed in user space, and lets the SCTP test suite pass when compiled
in 32bit and run on 64bit kernels.

Fixes: f9c67811ebc0 ("sctp: Fix regression introduced by new sctp_connectx api")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- fix soft-interface MTU computation
- fix bogus pointer mangling when parsing the TT-TVLV
  container. This bug led to a wrong memory access.
- fix memory leak by properly releasing the VLAN object
  after CRC check
- properly check pskb_may_pull() return value
- avoid potential race condition while adding new neighbour
- fix potential memory leak by removing all the references
  to the orig_node object in case of initialization failure
- fix the TT CRC computation by ensuring that every node uses
  the same byte order when hosts with different endianess are
  part of the same network
- fix severe memory leak by freeing skb after a successful
  TVLV parsing
- avoid potential double free when orig_node initialization
  fails
- fix potential kernel paging error caused by the usage of
  the old value of skb->data after skb reallocation

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'msix'

Alexander Gordeev says:

====================
net: Use pci_enable_msix_range() instead of pci_enable_msix()

As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Cc: e1000-devel@lists.sourceforge.net
Cc: linux-driver@qlogic.com
Cc: linux-net-drivers@solarflare.com
Cc: linux-pci@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: pv-drivers@vmware.com
Cc: wil6210@qca.qualcomm.com
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

wil6210: Use pci_enable_msi_range() instead of pci_enable_msi_block()

As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: wil6210@qca.qualcomm.com
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Acked-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vmxnet3: Use pci_enable_msix_range() instead of pci_enable_msix()

As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
Cc: pv-drivers@vmware.com
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

vmxnet3: Fix MSI-X/MSI enablement code

This update cleans up the MSI-X/MSI enablement code, fixes
vmxnet3_acquire_msix_vectors() invalid return values and
enables a dead code in case VMXNET3_LINUX_MIN_MSIX_VECT
MSI-X vectors were allocated.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
Cc: pv-drivers@vmware.com
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

niu: Use pci_enable_msix_range() instead of pci_enable_msix()

As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: Use pci_enable_msix_range() instead of pci_enable_msix()

As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Shradha Shah <sshah@solarflare.com>
Cc: linux-net-drivers@solarflare.com
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Acked-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlge: Use pci_enable_msix_range() instead of pci_enable_msix()

As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Shahed Shaikh <shahed.shaikh@qlogic.com>
Cc: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Cc: Ron Mercer <ron.mercer@qlogic.com>
Cc: linux-driver@qlogic.com
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>