openwrt/staging/blogic.git
12 years agoetherdevice.h: Add ether_addr_equal
Joe Perches [Tue, 8 May 2012 18:56:45 +0000 (18:56 +0000)]
etherdevice.h: Add ether_addr_equal

Add a boolean function to check if 2 ethernet addresses
are the same.

This is to avoid any confusion about compare_ether_addr
returning an unsigned, and not being able to use the
compare_ether_addr function for sorting ala memcmp.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge git://1984.lsi.us.es/net-next
David S. Miller [Wed, 9 May 2012 22:07:44 +0000 (18:07 -0400)]
Merge git://1984.lsi.us.es/net-next

12 years agoe1000e: Fix merge conflict (net->net-next)
Jeff Kirsher [Wed, 9 May 2012 09:23:46 +0000 (02:23 -0700)]
e1000e: Fix merge conflict (net->net-next)

During merge of net to net-next the changes in patch:

e1000e: Fix default interrupt throttle rate not set in NIC HW

got munged in param.c of the e1000e driver.  This rectifies the
merge issues.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetfilter: hashlimit: byte-based limit mode
Florian Westphal [Mon, 7 May 2012 10:51:45 +0000 (10:51 +0000)]
netfilter: hashlimit: byte-based limit mode

can be used e.g. for ingress traffic policing or
to detect when a host/port consumes more bandwidth than expected.

This is done by optionally making cost to mean
"cost per 16-byte-chunk-of-data" instead of "cost per packet".

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: hashlimit: move rateinfo initialization to helper
Florian Westphal [Mon, 7 May 2012 10:51:44 +0000 (10:51 +0000)]
netfilter: hashlimit: move rateinfo initialization to helper

followup patch would bloat main match function too much.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: limit, hashlimit: avoid duplicated inline
Florian Westphal [Mon, 7 May 2012 10:51:43 +0000 (10:51 +0000)]
netfilter: limit, hashlimit: avoid duplicated inline

credit_cap can be set to credit, which avoids inlining user2credits
twice. Also, remove inline keyword and let compiler decide.

old:
    684     192       0     876     36c net/netfilter/xt_limit.o
   4927     344      32    5303    14b7 net/netfilter/xt_hashlimit.o
now:
    668     192       0     860     35c net/netfilter/xt_limit.o
   4793     344      32    5169    1431 net/netfilter/xt_hashlimit.o

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: add xt_hmark target for hash-based skb marking
Hans Schillstrom [Wed, 2 May 2012 07:49:47 +0000 (07:49 +0000)]
netfilter: add xt_hmark target for hash-based skb marking

The target allows you to create rules in the "raw" and "mangle" tables
which set the skbuff mark by means of hash calculation within a given
range. The nfmark can influence the routing method (see "Use netfilter
MARK value as routing key") and can also be used by other subsystems to
change their behaviour.

[ Part of this patch has been refactorized and modified by Pablo Neira Ayuso ]

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: ip6_tables: add flags parameter to ipv6_find_hdr()
Hans Schillstrom [Mon, 23 Apr 2012 03:35:26 +0000 (03:35 +0000)]
netfilter: ip6_tables: add flags parameter to ipv6_find_hdr()

This patch adds the flags parameter to ipv6_find_hdr. This flags
allows us to:

* know if this is a fragment.
* stop at the AH header, so the information contained in that header
  can be used for some specific packet handling.

This patch also adds the offset parameter for inspection of one
inner IPv6 header that is contained in error messages.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agoMerge branch 'master' of git://1984.lsi.us.es/net-next
David S. Miller [Tue, 8 May 2012 18:40:21 +0000 (14:40 -0400)]
Merge branch 'master' of git://1984.lsi.us.es/net-next

12 years agonetfilter: remove ip_queue support
Pablo Neira Ayuso [Tue, 8 May 2012 17:45:28 +0000 (19:45 +0200)]
netfilter: remove ip_queue support

This patch removes ip_queue support which was marked as obsolete
years ago. The nfnetlink_queue modules provides more advanced
user-space packet queueing mechanism.

This patch also removes capability code included in SELinux that
refers to ip_queue. Otherwise, we break compilation.

Several warning has been sent regarding this to the mailing list
in the past month without anyone rising the hand to stop this
with some strong argument.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_conntrack: fix explicit helper attachment and NAT
Pablo Neira Ayuso [Thu, 3 May 2012 02:17:45 +0000 (02:17 +0000)]
netfilter: nf_conntrack: fix explicit helper attachment and NAT

Explicit helper attachment via the CT target is broken with NAT
if non-standard ports are used. This problem was hidden behind
the automatic helper assignment routine. Thus, it becomes more
noticeable now that we can disable the automatic helper assignment
with Eric Leblond's:

9e8ac5a netfilter: nf_ct_helper: allow to disable automatic helper assignment

Basically, nf_conntrack_alter_reply asks for looking up the helper
up if NAT is enabled. Unfortunately, we don't have the conntrack
template at that point anymore.

Since we don't want to rely on the automatic helper assignment,
we can skip the second look-up and stick to the helper that was
attached by iptables. With the CT target, the user is in full
control of helper attachment, thus, the policy is to trust what
the user explicitly configures via iptables (no automatic magic
anymore).

Interestingly, this bug was hidden by the automatic helper look-up
code. But it can be easily trigger if you attach the helper in
a non-standard port, eg.

iptables -I PREROUTING -t raw -p tcp --dport 8888 \
-j CT --helper ftp

And you disabled the automatic helper assignment.

I added the IPS_HELPER_BIT that allows us to differenciate between
a helper that has been explicitly attached and those that have been
automatically assigned. I didn't come up with a better solution
(having backward compatibility in mind).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_ct_expect: partially implement ctnetlink_change_expect
Kelvie Wong [Wed, 2 May 2012 14:39:24 +0000 (14:39 +0000)]
netfilter: nf_ct_expect: partially implement ctnetlink_change_expect

This refreshes the "timeout" attribute in existing expectations if one is
given.

The use case for this would be for userspace helpers to extend the lifetime
of the expectation when requested, as this is not possible right now
without deleting/recreating the expectation.

I use this specifically for forwarding DCERPC traffic through:

DCERPC has a port mapper daemon that chooses a (seemingly) random port for
future traffic to go to. We expect this traffic (with a reasonable
timeout), but sometimes the port mapper will tell the client to continue
using the same port. This allows us to extend the expectation accordingly.

Signed-off-by: Kelvie Wong <kelvie@ieee.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonet: export sysctl_[r|w]mem_max symbols needed by ip_vs_sync
Hans Schillstrom [Mon, 30 Apr 2012 06:13:50 +0000 (08:13 +0200)]
net: export sysctl_[r|w]mem_max symbols needed by ip_vs_sync

To build ip_vs as a module sysctl_rmem_max and sysctl_wmem_max
needs to be exported.

The dependency was added by "ipvs: wakeup master thread" patch.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agoipvs: ip_vs_proto: local functions should not be exposed globally
H Hartley Sweeten [Thu, 26 Apr 2012 18:26:15 +0000 (11:26 -0700)]
ipvs: ip_vs_proto: local functions should not be exposed globally

Functions not referenced outside of a source file should be marked
static to prevent it from being exposed globally.

This quiets the sparse warnings:

warning: symbol '__ipvs_proto_data_get' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: ip_vs_ftp: local functions should not be exposed globally
H Hartley Sweeten [Thu, 26 Apr 2012 18:17:28 +0000 (11:17 -0700)]
ipvs: ip_vs_ftp: local functions should not be exposed globally

Functions not referenced outside of a source file should be marked
static to prevent it from being exposed globally.

This quiets the sparse warnings:

warning: symbol 'ip_vs_ftp_init' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: optimize the use of flags in ip_vs_bind_dest
Pablo Neira Ayuso [Tue, 8 May 2012 07:28:19 +0000 (09:28 +0200)]
ipvs: optimize the use of flags in ip_vs_bind_dest

cp->flags is marked volatile but ip_vs_bind_dest
can safely modify the flags, so save some CPU cycles by
using temp variable.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: add support for sync threads
Pablo Neira Ayuso [Tue, 8 May 2012 17:40:30 +0000 (19:40 +0200)]
ipvs: add support for sync threads

Allow master and backup servers to use many threads
for sync traffic. Add sysctl var "sync_ports" to define the
number of threads. Every thread will use single UDP port,
thread 0 will use the default port 8848 while last thread
will use port 8848+sync_ports-1.

The sync traffic for connections is scheduled to many
master threads based on the cp address but one connection is
always assigned to same thread to avoid reordering of the
sync messages.

Remove ip_vs_sync_switch_mode because this check
for sync mode change is still risky. Instead, check for mode
change under sync_buff_lock.

Make sure the backup socks do not block on reading.

Special thanks to Aleksey Chudov for helping in all tests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: reduce sync rate with time thresholds
Julian Anastasov [Tue, 24 Apr 2012 20:46:40 +0000 (23:46 +0300)]
ipvs: reduce sync rate with time thresholds

Add two new sysctl vars to control the sync rate with the
main idea to reduce the rate for connection templates because
currently it depends on the packet rate for controlled connections.
This mechanism should be useful also for normal connections
with high traffic.

sync_refresh_period: in seconds, difference in reported connection
timer that triggers new sync message. It can be used to
avoid sync messages for the specified period (or half of
the connection timeout if it is lower) if connection state
is not changed from last sync.

sync_retries: integer, 0..3, defines sync retries with period of
sync_refresh_period/8. Useful to protect against loss of
sync messages.

Allow sysctl_sync_threshold to be used with
sysctl_sync_period=0, so that only single sync message is sent
if sync_refresh_period is also 0.

Add new field "sync_endtime" in connection structure to
hold the reported time when connection expires. The 2 lowest
bits will represent the retry count.

As the sysctl_sync_period now can be 0 use ACCESS_ONCE to
avoid division by zero.

Special thanks to Aleksey Chudov for being patient with me,
for his extensive reports and helping in all tests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: wakeup master thread
Pablo Neira Ayuso [Tue, 8 May 2012 17:39:49 +0000 (19:39 +0200)]
ipvs: wakeup master thread

High rate of sync messages in master can lead to
overflowing the socket buffer and dropping the messages.
Fixed sleep of 1 second without wakeup events is not suitable
for loaded masters,

Use delayed_work to schedule sending for queued messages
and limit the delay to IPVS_SYNC_SEND_DELAY (20ms). This will
reduce the rate of wakeups but to avoid sending long bursts we
wakeup the master thread after IPVS_SYNC_WAKEUP_RATE (8) messages.

Add hard limit for the queued messages before sending
by using "sync_qlen_max" sysctl var. It defaults to 1/32 of
the memory pages but actually represents number of messages.
It will protect us from allocating large parts of memory
when the sending rate is lower than the queuing rate.

As suggested by Pablo, add new sysctl var
"sync_sock_size" to configure the SNDBUF (master) or
RCVBUF (slave) socket limit. Default value is 0 (preserve
system defaults).

Change the master thread to detect and block on
SNDBUF overflow, so that we do not drop messages when
the socket limit is low but the sync_qlen_max limit is
not reached. On ENOBUFS or other errors just drop the
messages.

Change master thread to enter TASK_INTERRUPTIBLE
state early, so that we do not miss wakeups due to messages or
kthread_should_stop event.

Thanks to Pablo Neira Ayuso for his valuable feedback!

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: always update some of the flags bits in backup
Julian Anastasov [Tue, 24 Apr 2012 20:46:38 +0000 (23:46 +0300)]
ipvs: always update some of the flags bits in backup

As the goal is to mirror the inactconns/activeconns
counters in the backup server, make sure the cp->flags are
updated even if cp is still not bound to dest. If cp->flags
are not updated ip_vs_bind_dest will rely only on the initial
flags when updating the counters. To avoid mistakes and
complicated checks for protocol state rely only on the
IP_VS_CONN_F_INACTIVE bit when updating the counters.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: fix ip_vs_try_bind_dest to rebind app and transmitter
Julian Anastasov [Tue, 24 Apr 2012 20:46:37 +0000 (23:46 +0300)]
ipvs: fix ip_vs_try_bind_dest to rebind app and transmitter

Initially, when the synced connection is created we
use the forwarding method provided by master but once we
bind to destination it can be changed. As result, we must
update the application and the transmitter.

As ip_vs_try_bind_dest is called always for connections
that require dest binding, there is no need to validate the
cp and dest pointers.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: remove check for IP_VS_CONN_F_SYNC from ip_vs_bind_dest
Julian Anastasov [Tue, 24 Apr 2012 20:46:36 +0000 (23:46 +0300)]
ipvs: remove check for IP_VS_CONN_F_SYNC from ip_vs_bind_dest

As the IP_VS_CONN_F_INACTIVE bit is properly set
in cp->flags for all kind of connections we do not need to
add special checks for synced connections when updating
the activeconns/inactconns counters for first time. Now
logic will look just like in ip_vs_unbind_dest.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: ignore IP_VS_CONN_F_NOOUTPUT in backup server
Julian Anastasov [Tue, 24 Apr 2012 20:46:35 +0000 (23:46 +0300)]
ipvs: ignore IP_VS_CONN_F_NOOUTPUT in backup server

As IP_VS_CONN_F_NOOUTPUT is derived from the
forwarding method we should get it from conn_flags just
like we do it for IP_VS_CONN_F_FWD_MASK bits when binding
to real server.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: use GFP_KERNEL allocation where possible
Sasha Levin [Sat, 14 Apr 2012 16:37:47 +0000 (12:37 -0400)]
ipvs: use GFP_KERNEL allocation where possible

Use GFP_KERNEL instead of GFP_ATOMIC when registering an ipvs protocol.

This is safe since it will always run from a process context.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agoipvs: SH scheduler does not need GFP_ATOMIC allocation
Julian Anastasov [Fri, 13 Apr 2012 13:49:38 +0000 (16:49 +0300)]
ipvs: SH scheduler does not need GFP_ATOMIC allocation

        Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: LBLCR scheduler does not need GFP_ATOMIC allocation on init
Julian Anastasov [Fri, 13 Apr 2012 13:49:41 +0000 (16:49 +0300)]
ipvs: LBLCR scheduler does not need GFP_ATOMIC allocation on init

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: WRR scheduler does not need GFP_ATOMIC allocation
Julian Anastasov [Fri, 13 Apr 2012 13:49:42 +0000 (16:49 +0300)]
ipvs: WRR scheduler does not need GFP_ATOMIC allocation

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: DH scheduler does not need GFP_ATOMIC allocation
Julian Anastasov [Fri, 13 Apr 2012 13:49:39 +0000 (16:49 +0300)]
ipvs: DH scheduler does not need GFP_ATOMIC allocation

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: LBLC scheduler does not need GFP_ATOMIC allocation on init
Julian Anastasov [Fri, 13 Apr 2012 13:49:40 +0000 (16:49 +0300)]
ipvs: LBLC scheduler does not need GFP_ATOMIC allocation on init

Schedulers are initialized and bound to services only
on commands.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agoipvs: timeout tables do not need GFP_ATOMIC allocation
Julian Anastasov [Fri, 13 Apr 2012 13:49:37 +0000 (16:49 +0300)]
ipvs: timeout tables do not need GFP_ATOMIC allocation

They are called only on initialization.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
12 years agonetfilter: bridge: optionally set indev to vlan
Pablo Neira Ayuso [Tue, 8 May 2012 17:36:44 +0000 (19:36 +0200)]
netfilter: bridge: optionally set indev to vlan

if net.bridge.bridge-nf-filter-vlan-tagged sysctl is enabled, bridge
netfilter removes the vlan header temporarily and then feeds the packet
to ip(6)tables.

When the new "bridge-nf-pass-vlan-input-device" sysctl is on
(default off), then bridge netfilter will also set the
in-interface to the vlan interface; if such an interface exists.

This is needed to make iptables REDIRECT target work with
"vlan-on-top-of-bridge" setups and to allow use of "iptables -i" to
match the vlan device name.

Also update Documentation with current brnf default settings.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_conntrack: use this_cpu_inc()
Eric Dumazet [Wed, 18 Apr 2012 04:36:40 +0000 (06:36 +0200)]
netfilter: nf_conntrack: use this_cpu_inc()

this_cpu_inc() is IRQ safe and faster than
local_bh_disable()/__this_cpu_inc()/local_bh_enable(), at least on x86.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_ct_helper: allow to disable automatic helper assignment
Eric Leblond [Wed, 18 Apr 2012 09:20:41 +0000 (11:20 +0200)]
netfilter: nf_ct_helper: allow to disable automatic helper assignment

This patch allows you to disable automatic conntrack helper
lookup based on TCP/UDP ports, eg.

echo 0 > /proc/sys/net/netfilter/nf_conntrack_helper

[ Note: flows that already got a helper will keep using it even
  if automatic helper assignment has been disabled ]

Once this behaviour has been disabled, you have to explicitly
use the iptables CT target to attach helper to flows.

There are good reasons to stop supporting automatic helper
assignment, for further information, please read:

http://www.netfilter.org/news.html#2012-04-03

This patch also adds one message to inform that automatic helper
assignment is deprecated and it will be removed soon (this is
spotted only once, with the first flow that gets a helper attached
to make it as less annoying as possible).

Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agonetfilter: nf_ct_ecache: refactor notifier registration
Tony Zelenoff [Thu, 8 Mar 2012 23:35:39 +0000 (23:35 +0000)]
netfilter: nf_ct_ecache: refactor notifier registration

* ret variable initialization removed as useless
* similar code strings concatenated and functions code
  flow became more plain

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agoetherdev.h: Convert int is_<foo>_ether_addr to bool
Joe Perches [Tue, 8 May 2012 06:44:40 +0000 (06:44 +0000)]
etherdev.h: Convert int is_<foo>_ether_addr to bool

Make the return value explicitly true or false.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosmsc75xx: let EEPROM determine GPIO/LED settings
Steve Glendinning [Fri, 4 May 2012 00:57:13 +0000 (00:57 +0000)]
smsc75xx: let EEPROM determine GPIO/LED settings

This patch allows the GPIO/LED settings to be configured by the
EEPROM if present, and only sets the default values (LED outputs
for link/activity) when an EEPROM is not detected.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosmsc75xx: eliminate unnecessary phy register read
Steve Glendinning [Fri, 4 May 2012 00:57:12 +0000 (00:57 +0000)]
smsc75xx: eliminate unnecessary phy register read

Only a write is necessary to clear the interrupt status, and we
don't use the value from the preceding read operation.  This
patch eliminates the unnecessary read.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosmsc75xx: replace 0xffff with PHY_INT_SRC_CLEAR_ALL
Steve Glendinning [Fri, 4 May 2012 00:57:11 +0000 (00:57 +0000)]
smsc75xx: replace 0xffff with PHY_INT_SRC_CLEAR_ALL

This patch defines PHY_INT_SRC_CLEAR_ALL to replace the value 0xffff
in order to be more self-documenting.

This patch should make no functional change, it is purely cosmetic.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Tue, 8 May 2012 03:35:40 +0000 (23:35 -0400)]
Merge git://git./linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/intel/e1000e/param.c
drivers/net/wireless/iwlwifi/iwl-agn-rx.c
drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
drivers/net/wireless/iwlwifi/iwl-trans.h

Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell.  In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.

In e1000e a conflict arose in the validation code for settings of
adapter->itr.  'net-next' had more sophisticated logic so that
logic was used.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'vhost-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mst...
David S. Miller [Tue, 8 May 2012 03:05:13 +0000 (23:05 -0400)]
Merge branch 'vhost-net-next' of git://git./linux/kernel/git/mst/vhost

Michael S. Tsirkin says:

--------------------
There are mostly bugfixes here.
I hope to merge some more patches by 3.5, in particular
vlan support fixes are waiting for Eric's ack,
and a version of tracepoint patch might be
ready in time, but let's merge what's ready so it's testable.

This includes a ton of zerocopy fixes by Jason -
good stuff but too intrusive for 3.4 and zerocopy is experimental
anyway.

virtio supported delayed interrupt for a while now
so adding support to the virtio tool made sense
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: IP_MULTICAST_IF setsockopt now recognizes struct mreq
Jiri Pirko [Thu, 3 May 2012 22:37:45 +0000 (22:37 +0000)]
net: IP_MULTICAST_IF setsockopt now recognizes struct mreq

Until now, struct mreq has not been recognized and it was worked with
as with struct in_addr. That means imr_multiaddr was copied to
imr_address. So do recognize struct mreq here and copy that correctly.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetdev/of/phy: Add MDIO bus multiplexer driven by GPIO lines.
David Daney [Wed, 2 May 2012 15:16:39 +0000 (15:16 +0000)]
netdev/of/phy: Add MDIO bus multiplexer driven by GPIO lines.

The GPIO pins select which sub bus is connected to the master.

Initially tested with an sn74cbtlv3253 switch device wired into the
MDIO bus.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetdev/of/phy: Add MDIO bus multiplexer support.
David Daney [Wed, 2 May 2012 15:16:38 +0000 (15:16 +0000)]
netdev/of/phy: Add MDIO bus multiplexer support.

This patch adds a somewhat generic framework for MDIO bus
multiplexers.  It is modeled on the I2C multiplexer.

The multiplexer is needed if there are multiple PHYs with the same
address connected to the same MDIO bus adepter, or if there is
insufficient electrical drive capability for all the connected PHY
devices.

Conceptually it could look something like this:

                   ------------------
                   | Control Signal |
                   --------+---------
                           |
 ---------------   --------+------
 | MDIO MASTER |---| Multiplexer |
 ---------------   --+-------+----
                     |       |
                     C       C
                     h       h
                     i       i
                     l       l
                     d       d
                     |       |
     ---------       A       B   ---------
     |       |       |       |   |       |
     | PHY@1 +-------+       +---+ PHY@1 |
     |       |       |       |   |       |
     ---------       |       |   ---------
     ---------       |       |   ---------
     |       |       |       |   |       |
     | PHY@2 +-------+       +---+ PHY@2 |
     |       |                   |       |
     ---------                   ---------

This framework configures the bus topology from device tree data.  The
mechanics of switching the multiplexer is left to device specific
drivers.

The follow-on patch contains a multiplexer driven by GPIO lines.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetdev/of/phy: New function: of_mdio_find_bus().
David Daney [Wed, 2 May 2012 15:16:37 +0000 (15:16 +0000)]
netdev/of/phy: New function: of_mdio_find_bus().

Add of_mdio_find_bus() which allows an mii_bus to be located given its
associated the device tree node.

This is needed by the follow-on patch to add a driver for MDIO bus
multiplexers.

The of_mdiobus_register() function is modified so that the device tree
node is recorded in the mii_bus.  Then we can find it again by
iterating over all mdio_bus_class devices.

Because the OF device tree has now become an integral part of the
kernel, this can live in mdio_bus.c (which contains the needed
mdio_bus_class structure) instead of of_mdio.c.

Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/capi: elliminate capincci_find() in non-middleware case
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/capi: elliminate capincci_find() in non-middleware case

If Kernel CAPI is compiled without CONFIG_ISDN_CAPI_MIDDLEWARE,
the structure retrieved via capincci_find() is never actually
used, so don't compile that function in that case.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/capi: fix readability damage
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/capi: fix readability damage

Fix up some of the readibility deterioration caused by the recent
whitespace coding style cleanup.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/gigaset: unify function return values
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: unify function return values

Various functions in the Gigaset driver were using different
conventions for the meaning of their int return values.
Align them to the usual negative error numbers convention.

Inspired-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/gigaset: internal function name cleanup
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: internal function name cleanup

Functions clear_at_state and free_strings did the same thing;
drop one of them, keeping the more descriptive name.
Drop a redundant call.
Rename function dealloc_at_states to dealloc_temp_at_states
to clarify its purpose.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/gigaset: fix readability damage
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: fix readability damage

Fix up some of the readibility deterioration caused by the recent
whitespace coding style cleanup.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/gigaset: improve error handling querying firmware version
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: improve error handling querying firmware version

An out-of-place "OK" response to the "AT+GMR" (get firmware version)
command turns out to be, more often than not, a delayed response to
a previous command rather than an actual error, so continue waiting
for the version number in that case.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: stable <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/gigaset: fix CAPI disconnect B3 handling
Tilman Schmidt [Wed, 25 Apr 2012 13:02:20 +0000 (13:02 +0000)]
isdn/gigaset: fix CAPI disconnect B3 handling

If DISCONNECT_B3_IND was synthesized because of a DISCONNECT_REQ
with existing logical connections, the connection state wasn't
updated accordingly. Also the emitted DISCONNECT_B3_IND message
wasn't included in the debug log as requested.
This patch fixes both of these issues.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: stable <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoisdn/gigaset: ratelimit CAPI message dumps
Tilman Schmidt [Wed, 25 Apr 2012 13:02:19 +0000 (13:02 +0000)]
isdn/gigaset: ratelimit CAPI message dumps

Introduce a global ratelimit for CAPI message dumps to protect
against possible log flood.
Drop the ratelimit for ignored messages which is now covered by the
global one.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: stable <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: compare_ether_addr[_64bits]() has no ordering
Johannes Berg [Mon, 7 May 2012 13:39:06 +0000 (15:39 +0200)]
net: compare_ether_addr[_64bits]() has no ordering

Neither compare_ether_addr() nor compare_ether_addr_64bits()
(as it can fall back to the former) have comparison semantics
like memcmp() where the sign of the return value indicates sort
order. We had a bug in the wireless code due to a blind memcmp
replacement because of this.

A cursory look suggests that the wireless bug was the only one
due to this semantic difference.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Mon, 7 May 2012 15:47:51 +0000 (11:47 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

12 years agoskb: Add inline helper for getting the skb end offset from head
Alexander Duyck [Fri, 4 May 2012 14:26:56 +0000 (14:26 +0000)]
skb: Add inline helper for getting the skb end offset from head

With the recent changes for how we compute the skb truesize it occurs to me
we are probably going to have a lot of calls to skb_end_pointer -
skb->head.  Instead of running all over the place doing that it would make
more sense to just make it a separate inline skb_end_offset(skb) that way
we can return the correct value without having gcc having to do all the
optimization to cancel out skb->head - skb->head.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoskb: Drop "fastpath" variable for skb_cloned check in pskb_expand_head
Alexander Duyck [Fri, 4 May 2012 14:26:51 +0000 (14:26 +0000)]
skb: Drop "fastpath" variable for skb_cloned check in pskb_expand_head

Since there is now only one spot that actually uses "fastpath" there isn't
much point in carrying it.  Instead we can just use a check for skb_cloned
to verify if we can perform the fast-path free for the head or not.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoskb: Drop bad code from pskb_expand_head
Alexander Duyck [Fri, 4 May 2012 14:26:46 +0000 (14:26 +0000)]
skb: Drop bad code from pskb_expand_head

The fast-path for pskb_expand_head contains a check where the size plus the
unaligned size of skb_shared_info is compared against the size of the data
buffer.  This code path has two issues.  First is the fact that after the
recent changes by Eric Dumazet to __alloc_skb and build_skb the shared info
is always placed in the optimal spot for a buffer size making this check
unnecessary.  The second issue is the fact that the check doesn't take into
account the aligned size of shared info.  As a result the code burns cycles
doing a memcpy with nothing actually being shifted.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agocdc_ether: Ignore bogus union descriptor for RNDIS devices
Bjørn Mork [Thu, 26 Apr 2012 02:35:10 +0000 (02:35 +0000)]
cdc_ether: Ignore bogus union descriptor for RNDIS devices

Some RNDIS devices include a bogus CDC Union descriptor pointing
to non-existing interfaces.  The RNDIS code is already prepared
to handle devices without a CDC Union descriptor by hardwiring
the driver to use interfaces 0 and 1, which is correct for the
devices with the bogus descriptor as well. So we can reuse the
existing workaround.

Cc: Markus Kolb <linux-201011@tower-net.de>
Cc: Iker Salmón San Millán <shaola@esdebian.org>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: 655387@bugs.debian.org
Cc: stable@vger.kernel.org
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: bug fix when loading after SAN boot
Ariel Elior [Sun, 6 May 2012 07:05:57 +0000 (07:05 +0000)]
bnx2x: bug fix when loading after SAN boot

This is a bug fix for an "interface fails to load" issue.
The issue occurs when bnx2x driver loads after UNDI driver was previously
loaded over the chip. In such a scenario the UNDI driver is loaded and operates
in the pre-boot kernel, within its own specific host memory address range.
When the pre-boot stage is complete, the real kernel is loaded, in a new and
distinct host memory address range. The transition from pre-boot stage to boot
is asynchronous from UNDI point of view.

A race condition occurs when UNDI driver triggers a DMAE transaction to valid
host addresses in the pre-boot stage, when control is diverted to the real
kernel. This results in access to illegal addresses by our HW as the addresses
which were valid in the preboot stage are no longer considered valid.
Specifically, the 'was_error' bit in the pci glue of our device is set. This
causes all following pci transactions from chip to host to timeout (in
accordance to the pci spec).

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoixgbe: dcb: IEEE PFC stats and reset logic incorrect
John Fastabend [Mon, 23 Apr 2012 22:27:28 +0000 (22:27 +0000)]
ixgbe: dcb: IEEE PFC stats and reset logic incorrect

PFC stats are only tabulated when PFC is enabled. However in IEEE
mode the ieee_pfc pfc_tc bits were not checked and the calculation
was aborted.

This results in statistics not being reported through ethtool and
possible a false Tx hang occurring when receiving pause frames.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: increase version number
Bruce Allan [Fri, 4 May 2012 08:52:03 +0000 (08:52 +0000)]
e1000e: increase version number

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: clear REQ and GNT in EECD (82571 && 82572)
Richard Alpe [Fri, 20 Apr 2012 15:24:50 +0000 (15:24 +0000)]
e1000e: clear REQ and GNT in EECD (82571 && 82572)

Clear the REQ and GNT bit in the eeprom control register (EECD).
This is required if the eeprom is to be accessed with auto read
EERD register.

After a cold reset this doesn't matter but if PBIST MAC test was
executed before booting, the register was left in a dirty state
(the 2 bits where set), which caused the read operation to time out
and returning 0.

Reference (page 312):
http://download.intel.com/design/network/manuals/316080.pdf

Reported-by: Aleksandar Igic <aleksandar.igic@dektech.com.au>
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: enable forced master/slave on 82577
Bruce Allan [Tue, 20 Mar 2012 03:47:41 +0000 (03:47 +0000)]
e1000e: enable forced master/slave on 82577

Like other supported (igp) PHYs, the driver needs to be able to force the
master/slave mode on 82577.  Since the code is the same as what already
exists in the code flow for igp PHYs, move it to a new function to be
called for both flows.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Fri, 4 May 2012 16:07:15 +0000 (12:07 -0400)]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless

12 years agotcp: be more strict before accepting ECN negociation
Eric Dumazet [Fri, 4 May 2012 05:14:02 +0000 (05:14 +0000)]
tcp: be more strict before accepting ECN negociation

It appears some networks play bad games with the two bits reserved for
ECN. This can trigger false congestion notifications and very slow
transferts.

Since RFC 3168 (6.1.1) forbids SYN packets to carry CT bits, we can
disable TCP ECN negociation if it happens we receive mangled CT bits in
the SYN packet.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Perry Lorier <perryl@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Wilmer van der Gaast <wilmer@google.com>
Cc: Ankur Jain <jankur@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Dave Täht <dave.taht@bufferbloat.net>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomISDN: Help to identify the card
Karsten Keil [Fri, 4 May 2012 04:15:35 +0000 (04:15 +0000)]
mISDN: Help to identify the card

With multiple cards is hard to figure out which port caused trouble
int the layer2 routines (e.g. got a timeout).
Now we have the informations in the log output.

Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomISDN: Layer1 statemachine fix
Karsten Keil [Fri, 4 May 2012 04:15:34 +0000 (04:15 +0000)]
mISDN: Layer1 statemachine fix

The timer3 and the activation delay timer need to be independent.
If timer3 fires do not reqest power up we have to send only INFO 0.
Now layer1 pass TBR3 again.

Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomISDN: Make layer1 timer 3 value configurable
Karsten Keil [Fri, 4 May 2012 04:15:33 +0000 (04:15 +0000)]
mISDN: Make layer1 timer 3 value configurable

For certification test it is very useful to change the layer1
timer3 value on runtime.

Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomISDN: L2 timeouts need to be queued as L2 event
Karsten Keil [Fri, 4 May 2012 04:15:32 +0000 (04:15 +0000)]
mISDN: L2 timeouts need to be queued as L2 event

To be full preemptiv safe, we cannot handle a L2 timeout in the timer
context itself, we should do all actions via the D-channel thread.

Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomISDN: Fix refcounting bug
Karsten Keil [Fri, 4 May 2012 04:15:31 +0000 (04:15 +0000)]
mISDN: Fix refcounting bug

Under some configs it was still not possible to unload the driver,
because the module use count was srewed up.

Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agomISDN: Added PH_* state info to tei manager.
Andreas Eversberg [Fri, 4 May 2012 04:15:30 +0000 (04:15 +0000)]
mISDN: Added PH_* state info to tei manager.

Tei manager reports current layer 1 state on creation.
On state change it reports it to the socket interface.

Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: sched: factorize code (qdisc_drop())
Eric Dumazet [Fri, 4 May 2012 04:37:21 +0000 (04:37 +0000)]
net: sched: factorize code (qdisc_drop())

Use qdisc_drop() helper where possible.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net
David S. Miller [Fri, 4 May 2012 14:55:50 +0000 (10:55 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net

12 years agoe1000: Silence sparse warnings by correcting type
Andrei Emeltchenko [Sun, 25 Mar 2012 17:49:25 +0000 (17:49 +0000)]
e1000: Silence sparse warnings by correcting type

Silence sparse warnings shown below:
...
drivers/net/ethernet/intel/e1000/e1000_main.c:3435:17: warning:
cast to restricted __le64
drivers/net/ethernet/intel/e1000/e1000_main.c:3435:17: warning:
cast to restricted __le64
...

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoigb, ixgbe: netdev_tx_reset_queue incorrectly called from tx init path
John Fastabend [Mon, 23 Apr 2012 12:22:39 +0000 (12:22 +0000)]
igb, ixgbe: netdev_tx_reset_queue incorrectly called from tx init path

igb and ixgbe incorrectly call netdev_tx_reset_queue() from
i{gb|xgbe}_clean_tx_ring() this sort of works in most cases except
when the number of real tx queues changes. When the number of real
tx queues changes netdev_tx_reset_queue() only gets called on the
new number of queues so when we reduce the number of queues we risk
triggering the watchdog timer and repeated device resets.

So this is not only a cosmetic issue but causes real bugs. For
example enabling/disabling DCB or FCoE in ixgbe will trigger this.

CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: John Bishop <johnx.bishop@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Update link flow control to correctly handle multiple packet buffer DCB
Alexander Duyck [Thu, 19 Apr 2012 17:48:48 +0000 (17:48 +0000)]
ixgbe: Update link flow control to correctly handle multiple packet buffer DCB

This change updates the link flow control configuration so that we
correctly set the link flow control settings for DCB.  Previously we would
have to call the fc_enable call 8 times, once for each packet buffer.  If
we move that logic into the fc_enable call itself we can avoid multiple
unnecessary register writes.

This change also corrects an issue in which we were only shifting the water
marks for 82599 parts by 6 instead of 10.  This was resulting in us only
using 1/16 of the packet buffer when flow control was enabled.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Reorder link flow control functions in ixgbe_common.c
Alexander Duyck [Thu, 19 Apr 2012 17:49:56 +0000 (17:49 +0000)]
ixgbe: Reorder link flow control functions in ixgbe_common.c

We can avoid many of the forward declarations found in ixgbe_common.c by
just reordering things so this patch does that to help cleanup the code.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Use __free_pages instead of put_page to release pages
Alexander Duyck [Fri, 6 Apr 2012 04:24:50 +0000 (04:24 +0000)]
ixgbe: Use __free_pages instead of put_page to release pages

This change replaces the calls to put_page with calls to __free_page.

Since the FCoE code is able to access order 1 pages I thought it would be a
good idea to change things over to using __free_pages since that is the
preferred approach for freeing pages.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Make ixgbe_fc_autoneg return void and always set current_mode
Alexander Duyck [Wed, 28 Mar 2012 08:03:48 +0000 (08:03 +0000)]
ixgbe: Make ixgbe_fc_autoneg return void and always set current_mode

This change makes it so that ixgbe_fc_autoneg is a void and always sets the
current_mode.  Previously if the link was down we would return an error,
however there is no harm in simply treating a link down case as a case in
which autoneg simply failed.  This allows us to rely on the return value of
the ixgbe_fc_enable call now since there should be no cases where it
returns an error that would normally be ignored.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Reorder the ring to q_vector mapping to improve performance
Alexander Duyck [Wed, 28 Mar 2012 08:03:43 +0000 (08:03 +0000)]
ixgbe: Reorder the ring to q_vector mapping to improve performance

This change reorders the mapping of rings to q_vectors in the case that the
number of rings exceeds the number of q_vectors.  Previously we would
allocate the first R/N queues to the first q_vector where R is the number
of rings and N is the number of q_vectors.  Instead of doing this we can do
a better job of interleaving the rings to the CPUs by assigning every Nth
ring to the q_vector.

The below tables illustrate this change for the R = 16 N = 4 case.
          Before patch  After patch
q_vector:  0  1  2  3    0  1  2  3
Rings:     0  4  8 12    0  1  2  3
           1  5  9 13    4  5  6  7
           3  6 10 14    8  9 10 11
           4  7 11 15   12 13 14 15

This should improve the performance for both DCB or ATR when the number of
rings exceeds the number of q_vectors allocated by the adapter.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoixgbe: Track instances of buffer available but no DMA resources present
Alexander Duyck [Wed, 28 Mar 2012 08:03:32 +0000 (08:03 +0000)]
ixgbe: Track instances of buffer available but no DMA resources present

This change makes it so that we can track instances of where a packet was
dropped due to a packet being received when there are no DMA buffers
available in the ring.

For some reason this was only being enabled with RSC, however it makes
more sense to always have this feature on so that we can track any cases
where we might drop a buffer due to an Rx ring being full.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: initial support for i217
Bruce Allan [Thu, 19 Apr 2012 03:21:47 +0000 (03:21 +0000)]
e1000e: initial support for i217

i217 is the next-generation LOM that will be available on systems with the
Lynx Point Platform Controller Hub (PCH) chipset from Intel.  This patch
provides the initial support for the device.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoe1000e: Update driver version number
Matthew Vick [Wed, 25 Apr 2012 04:45:57 +0000 (04:45 +0000)]
e1000e: Update driver version number

Version bump to 1.11.3-k.

Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
12 years agoMerge tag 'mfd-for-linus-3.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 4 May 2012 00:21:05 +0000 (17:21 -0700)]
Merge tag 'mfd-for-linus-3.4-rc6' of git://git./linux/kernel/git/sameo/mfd-2.6

Pull second set of MFD fixes from Samuel Ortiz:
 "This time we only have a one liner fixing an omap-usb build error."

* tag 'mfd-for-linus-3.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Fix build breakage in omap-usb-host.c

12 years agoMerge branch 'efi-vars' from Matthew Garrett
Linus Torvalds [Fri, 4 May 2012 00:19:48 +0000 (17:19 -0700)]
Merge branch 'efi-vars' from Matthew Garrett

* efi-vars:
  efivars: Improve variable validation

12 years agoefivars: Improve variable validation
Matthew Garrett [Thu, 3 May 2012 20:50:46 +0000 (16:50 -0400)]
efivars: Improve variable validation

Ben Hutchings pointed out that the validation in efivars was inadequate -
most obviously, an entry with size 0 would server as a DoS against the
kernel. Improve this based on his suggestions.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge tag 'tag/upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarz...
Linus Torvalds [Fri, 4 May 2012 00:16:52 +0000 (17:16 -0700)]
Merge tag 'tag/upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

Pull libata fixes from Jeff Garzik:

1) Fix regression that could cause a misdiagnosis, which in turn could
   lead to an erroneous 3.0 Gbps -> 1.5 downshift, particularly when hotplug
   and suspend/resume is involved.

2) Fix a regression that led to ata%d controller ids being numbered one
   larger than in <= 3.4-rc3 (oh, the horror!).  Controller ids should now be
   as expected.

3) add some DT, PCI id's

4) ata/pata_arasan_cf: minor cpp fixing/cleaning

* tag 'tag/upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  ata: ahci_platform: Add synopsys ahci controller in DT's compatible list
  ata/pata_arasan_cf: Move arasan_cf_pm_ops out of #ifdef, #endif macros
  libata: init ata_print_id to 0
  ahci: Detect Marvell 88SE9172 SATA controller
  libata: skip old error history when counting probe trials

12 years agoMerge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Linus Torvalds [Fri, 4 May 2012 00:15:47 +0000 (17:15 -0700)]
Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux

Pull i2c embedded fixes from Wolfram Sang:
 "Here are some typical i2c driver bugfixes for 3.4.  Missed clock
  handling, improper timeout fixes, hardware wrokarounds...  All
  patches have been in linux-next for a few days, too."

* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
  i2c: mxs: disable QUEUE when sending is done
  i2c: mxs: handle spurious interrupt
  i2c-eg20t: Modify MODULE_AUTHOR's email address
  i2c-eg20t: change timeout value 50msec to 1000msec
  i2c: tegra: Add delay before resetting the controller after NACK
  i2c: pnx: Disable clk in suspend

12 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 4 May 2012 00:14:55 +0000 (17:14 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Just some regression fixes from Ben along with a variable that gcc
  failed to spot is uninitialised."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  nouveau: initialise has_optimus variable.
  drm/nv10/gpio: fix thinko in mask for gpio lines 2-9
  nvc0/fb: shut up PMFB interrupt after the first occurrence
  drm/nouveau/hdmi: use correct hdmi regs for nvaa/nvac
  drm/nouveau/bios: fix regression on some nv4x board

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 4 May 2012 00:10:39 +0000 (17:10 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Transfer padding was wrong for full-speed USB in ASIX driver, fix
    from Ingo van Lil.

 2) Propagate the negative packet offset fix into the PowerPC BPF JIT.
    From Jan Seiffert.

 3) dl2k driver's private ioctls were letting unprivileged tasks make
    MII writes and other ugly bits like that.  Fix from Jeff Mahoney.

 4) Fix TX VLAN and RX packet drops in ucc_geth, from Joakim Tjernlund.

 5) OOPS and network namespace fixes in IPVS from Hans Schillstrom and
    Julian Anastasov.

 6) Fix races and sleeping in locked context bugs in drop_monitor, from
    Neil Horman.

 7) Fix link status indication in smsc95xx driver, from Paolo Pisati.

 8) Fix bridge netfilter OOPS, from Peter Huang.

 9) L2TP sendmsg can return on error conditions with the socket lock
    held, oops.  Fix from Sasha Levin.

10) udp_diag should return meaningful values for socket memory usage,
    from Shan Wei.

11) Eric Dumazet is so awesome he gets his own section:

       Socket memory cgroup code (I never should have applied those
       patches, grumble...) made erroneous changes to
       sk_sockets_allocated_read_positive().  It was changed to
       use percpu_counter_sum_positive (which requires BH disabling)
       instead of percpu_counter_read_positive (which does not).
       Revert back to avoid crashes and lockdep warnings.

       Adjust the default tcp_adv_win_scale and tcp_rmem[2] values
       to fix throughput regressions.  This is necessary as a result
       of our more precise skb->truesize tracking.

       Fix SKB leak in netem packet scheduler.

12) New device IDs for various bluetooth devices, from Manoj Iyer,
    AceLan Kao, and Steven Harms.

13) Fix command completion race in ipw2200, from Stanislav Yakovlev.

14) Fix rtlwifi oops on unload, from Larry Finger.

15) Fix hard_mtu when adjusting hard_header_len in smsc95xx driver.
    From Stephane Fillod.

16) ehea driver registers it's IRQ before all the necessary state is
    setup, resulting in crashes.  Fix from Thadeu Lima de Souza
    Cascardo.

17) Fix PHY connection failures in davinci_emac driver, from Anatolij
    Gustschin.

18) Missing break; in switch statement in bluetooth's
    hci_cmd_complete_evt().  Fix from Szymon Janc.

19) Fix queue programming in iwlwifi, from Johannes Berg.

20) Interrupt throttling defaults not being actually programmed into the
    hardware, fix from Jeff Kirsher and Ying Cai.

21) TLAN driver SKB encoding in descriptor busted on 64-bit, fix from
    Benjamin Poirier.

22) Fix blind status block RX producer pointer deref in TG3 driver, from
    Matt Carlson.

23) Promisc and multicast are busted on ehea, fixes from Thadeu Lima de
    Souza Cascardo.

24) Fix crashes in 6lowpan, from Alexander Smirnov.

25) tcp_complete_cwr() needs to be careful to not rewind the CWND to
    ssthresh if ssthresh has the "infinite" value.  Fix from Yuchung
    Cheng.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
  sungem: Fix WakeOnLan
  tcp: change tcp_adv_win_scale and tcp_rmem[2]
  net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
  drop_monitor: prevent init path from scheduling on the wrong cpu
  usbnet: fix failure handling in usbnet_probe
  usbnet: fix leak of transfer buffer of dev->interrupt
  ucc_geth: Add 16 bytes to max TX frame for VLANs
  net: ucc_geth, increase no. of HW RX descriptors
  netem: fix possible skb leak
  sky2: fix receive length error in mixed non-VLAN/VLAN traffic
  sky2: propogate rx hash when packet is copied
  net: fix two typos in skbuff.h
  cxgb3: Don't call cxgb_vlan_mode until q locks are initialized
  ixgbe: fix calling skb_put on nonlinear skb assertion bug
  ixgbe: Fix a memory leak in IEEE DCB
  igbvf: fix the bug when initializing the igbvf
  smsc75xx: enable mac to detect speed/duplex from phy
  smsc75xx: declare smsc75xx's MII as GMII capable
  smsc75xx: fix phy interrupt acknowledge
  smsc75xx: fix phy init reset loop
  ...

12 years agoMerge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Fri, 4 May 2012 00:08:58 +0000 (17:08 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
 "Fix OOPS seen in coretemp driver if the CPU core ID is too large"

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (coretemp) Increase CPU core limit
  hwmon: (coretemp) fix oops on cpu unplug

12 years agonet/niu: remove one superfluous dma mask check
Sebastian Andrzej Siewior [Thu, 3 May 2012 18:22:00 +0000 (20:22 +0200)]
net/niu: remove one superfluous dma mask check

The idea here seems to be to get a 44bit DMA mask working and if this
fails it should fallback to a 32bit DMA mask. The dma_mask variable is
assigned once to 44bit and never updated. pci_set_dma_mask() and
pci_set_consistent_dma_mask() are both implemented as functions so there
is no evil macro which might update dma_mask. Looking at the assembly, I
see a call to dma_set_mask() followed by dma_supported() and then a jump
passed the second dma_set_mask(). The only way to get to second
dma_set_mask() call is by an error code in the first one.

So I hereby remove the check since it looks superfluous. Please ignore
the path if there is black magic involved.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoata: ahci_platform: Add synopsys ahci controller in DT's compatible list
Viresh Kumar [Sat, 21 Apr 2012 12:10:12 +0000 (17:40 +0530)]
ata: ahci_platform: Add synopsys ahci controller in DT's compatible list

SPEAr13xx series of SoCs contain Synopsys AHCI SATA Controller which shares
ahci_platform driver with other controller versions.

This patch updates DT compatible list for ahci_platform. It also updates and
renames binding documentation to more generic name.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
12 years agoata/pata_arasan_cf: Move arasan_cf_pm_ops out of #ifdef, #endif macros
Viresh Kumar [Sat, 21 Apr 2012 12:10:09 +0000 (17:40 +0530)]
ata/pata_arasan_cf: Move arasan_cf_pm_ops out of #ifdef, #endif macros

#ifdef, #endif is not required in definition/usage of arasan_cf_pm_ops. So, move
this definition and its usage outside of them.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
12 years agolibata: init ata_print_id to 0
Tero Roponen [Sun, 22 Apr 2012 08:38:00 +0000 (11:38 +0300)]
libata: init ata_print_id to 0

When comparing the dmesg between 3.4-rc3 and 3.4-rc4 I found the
following differences:

 -ata1: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff100 irq 47
 -ata2: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff180 irq 47
 -ata3: DUMMY
 +ata2: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff100 irq 47
 +ata3: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff180 irq 47
  ata4: DUMMY
  ata5: DUMMY
 -ata6: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff380 irq 47
 +ata6: DUMMY
 +ata7: SATA max UDMA/133 abar m2048@0xf9fff000 port 0xf9fff380 irq 47

The change of numbering comes from commit 85d6725b7c0d7e3f ("libata:
make ata_print_id atomic") that changed lines like

ap->print_id = ata_print_id++;
to
ap->print_id = atomic_inc_return(&ata_print_id);

As the latter behaves like ++ata_print_id, we must initialize
it to zero to start the numbering from one.

Signed-off-by: Tero Roponen <tero.roponen@gmail.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
12 years agoahci: Detect Marvell 88SE9172 SATA controller
Matt Johnson [Fri, 27 Apr 2012 06:42:30 +0000 (01:42 -0500)]
ahci: Detect Marvell 88SE9172 SATA controller

The Marvell 88SE9172 SATA controller (PCI ID 1b4b 917a) already worked
once it was detected, but was missing an ahci_pci_tbl entry.

Boot tested on a Gigabyte Z68X-UD3H-B3 motherboard.

Signed-off-by: Matt Johnson <johnso87@illinois.edu>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
12 years agolibata: skip old error history when counting probe trials
Lin Ming [Thu, 3 May 2012 14:15:07 +0000 (22:15 +0800)]
libata: skip old error history when counting probe trials

Commit d902747("[libata] Add ATA transport class") introduced
ATA_EFLAG_OLD_ER to mark entries in the error ring as cleared.

But ata_count_probe_trials_cb() didn't check this flag and it still
counts the old error history. So wrong probe trials count is returned
and it causes problem, for example, SATA link speed is slowed down from
3.0Gbps to 1.5Gbps.

Fix it by checking ATA_EFLAG_OLD_ER in ata_count_probe_trials_cb().

Cc: stable <stable@vger.kernel.org> # 2.6.37+
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
12 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Thu, 3 May 2012 17:30:11 +0000 (13:30 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

12 years agoskb: Add skb_head_is_locked helper function
Alexander Duyck [Thu, 3 May 2012 01:09:42 +0000 (01:09 +0000)]
skb: Add skb_head_is_locked helper function

This patch adds support for a skb_head_is_locked helper function.  It is
meant to be used any time we are considering transferring the head from
skb->head to a paged frag.  If the head is locked it means we cannot remove
the head from the skb so it must be copied or we must take the skb as a
whole.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet: Fix truesize accounting in skb_gro_receive()
Eric Dumazet [Wed, 2 May 2012 23:33:21 +0000 (23:33 +0000)]
net: Fix truesize accounting in skb_gro_receive()

GRO is very optimistic in skb truesize estimates, only taking into
account the used part of fragments.

Be conservative, and use more precise computation, so that bloated GRO
skbs can be collapsed eventually.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>