openwrt/staging/blogic.git
Ilpo Järvinen [Tue, 25 Nov 2008 05:30:21 +0000 (21:30 -0800)]
tcp: handle shift/merge of cloned skbs too

This caused me to get repeatably:

  tcpdump: pcap_loop: recvfrom: Bad address

Happens occasionally when I tcpdump my for-looped test xfers:
  while [ : ]; do echo -n "$(date '+%s.%N') "; ./sendfile; sleep 20; done

Rest of the relevant commands:
  ethtool -K eth0 tso off
  tc qdisc add dev eth0 root netem drop 4%
  tcpdump -n -s0 -i eth0 -w sacklog.all

Running net-next under kvm, connection goes to the same host
(basically just out of kvm). The connection itself works ok
and data gets sent without corruption even with a large
number of tests while tcpdump fails usually within less than
5 tests.

Whether it only happens because of this change or not, I
don't know for sure, but it's the only thing with which
I've seen that error. The non-cloned variant works without it
for a much longer time. I have yet to debug where the error
actually comes from.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:27:22 +0000 (21:27 -0800)]
tcp: add some mibs to track collapsing

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:26:56 +0000 (21:26 -0800)]
tcp: Make shifting not clear the hints

The earlier version was a very basic one that "played safe"
by always clearing the hints. However, clearing a hint is an
extremely costly operation with large windows, so it must be
avoided whenever possible. Shifting, too, can be done without
clearing the hints.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:20:15 +0000 (21:20 -0800)]
tcp: Try to restore large SKBs while SACK processing

During SACK processing, most of the benefits of TSO are eaten by
the SACK blocks that, one by one, fragment SKBs into MSS-sized chunks.
We are then in trouble when the cleanup work for them has to be done
as a large cumulative ACK arrives. Try to return to the pre-split
state while more and more SACK info is discovered, by combining
newly discovered SACK areas with the previous skb if that is
SACKed as well.

This approach has a number of benefits:

1) The processing overhead is spread more evenly over the RTT
2) The write queue has fewer skbs to process (this affects everything
   that has to walk the queue past the SACKed areas)
3) The write queue is consistent the whole time, so no other parts
   of TCP have to be aware of this (this was not the case with
   some other approach that was, well, quite intrusive all
   around).
4) Clean_rtx_queue can release most of the pages using a single
   put_page instead of the previous PAGE_SIZE/mss+1 calls

In case a hole is fully filled by the new SACK block, we attempt
to combine the next skb too, which allows construction of skbs
that are even larger than what TSO split them to, and it handles
the hole-at-every-nth-segment patterns that often occur during
slow start overshoot pretty nicely. For this to be really useful,
though, a retransmission would also have to get lost, since
cumulative ACKs advance one hole at a time in the most typical case.

TODO: handle upwards-only merging. That should be rather easy
when a segment is fully SACKed, but I'm leaving that as a future
work item (it won't make a very large difference anyway, since
the current approach already covers quite a lot of normal
cases).

I was earlier thinking of some sophisticated way of tracking
timestamps of the first and the last segment but later on
realized that it won't be that necessary at all to store the
timestamp of the last segment. The cases that can occur are
basically either:
  1) ambiguous => no sensible measurement can be taken anyway
  2) non-ambiguous is due to reordering => having the timestamp
     of the last segment there just skews things further off
     rather than doing any good, since the ack got triggered by one of
     the holes (besides some subtle issues that would make
     determining the right hole/skb an even harder problem). Anyway,
     it has nothing to do with this change then.

I chose to route some abnormal-looking cases with goto noop;
some could be handled differently (e.g., by stopping the
walking at that skb but again). In general, they either
shouldn't happen at all or are rare enough to make no difference
in practice.

In theory this change (as a whole) could cause some macroscale
regression (global) because of cache misses that are taken over
the round-trip time, but it very likely gets better because of far
fewer (local) cache misses for the other write queue walkers and the
big recovery-clearing cumulative ACK.

It is worth noting that these benefits would be very easy to get
even without TSO/GSO being on, as long as the data is in pages so that
we can merge them. Currently I won't let that happen because
DSACK splitting at a fragment would mess up pcounts due to
sk_can_gso in tcp_set_skb_tso_segs. Once DSACK fragments are
avoided, we have some conditions that can be made less strict.

TODO: I will probably have to convert the excessive pointer
passing to struct sacktag_state... :-)

My testing revealed that a considerable number of skbs couldn't
be shifted because they were cloned (most likely still awaiting
tx reclaim)...

[The rest should be considered future work instead, since I
repeatably got EFAULT from tcpdump's recvfrom when I added
pskb_expand_head to deal with clones, so I separated that
into another, later patch]

...To counter that, I gave up on the fifth advantage:

5) When growing previous SACK block, less allocs for new skbs
   are done, basically a new alloc is needed only when new hole
   is detected and when the previous skb runs out of frags space

...which now only happens if reclaim is fast enough to dispose of
the clone before the SACK block comes in (the window is RTT long);
otherwise we'll have to alloc some.

With clones being handled I got these numbers (will be somewhat
worse without that), taken with fine-grained mibs:

                  TCPSackShifted 398
                   TCPSackMerged 877
            TCPSackShiftFallback 320
      TCPSACKCOLLAPSEFALLBACKGSO 0
  TCPSACKCOLLAPSEFALLBACKSKBBITS 0
  TCPSACKCOLLAPSEFALLBACKSKBDATA 0
    TCPSACKCOLLAPSEFALLBACKBELOW 0
    TCPSACKCOLLAPSEFALLBACKFIRST 1
 TCPSACKCOLLAPSEFALLBACKPREVBITS 318
      TCPSACKCOLLAPSEFALLBACKMSS 1
   TCPSACKCOLLAPSEFALLBACKNOHEAD 0
    TCPSACKCOLLAPSEFALLBACKSHIFT 0
          TCPSACKCOLLAPSENOOPSEQ 0
  TCPSACKCOLLAPSENOOPSMALLPCOUNT 0
     TCPSACKCOLLAPSENOOPSMALLLEN 0
             TCPSACKCOLLAPSEHOLE 12

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:14:43 +0000 (21:14 -0800)]
tcp: make tcp_sacktag_one able to handle partial skb too

This is preparatory work for the SACK combiner patch, which may
have to count TCP state changes for only a part of the skb
because it will intentionally avoid splitting the skb into SACKed
and not-SACKed parts.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:13:50 +0000 (21:13 -0800)]
tcp: Make SACK code to split only at mss boundaries

Sadly enough, this adds a possible divide, though we try to avoid
it by checking for one mss as the common case.
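
The trade-off described above can be sketched in userspace C. This is
only an illustration under assumptions: round_split_to_mss() is a
hypothetical helper, not the kernel's code, and it models "checking one
mss" as handling split points below two MSS with comparisons so that
the division runs only for larger offsets.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the kernel's code): round a proposed split
 * point down to a multiple of mss. A split point below two MSS is
 * answered by comparisons alone; the division only runs for larger
 * offsets.
 */
static unsigned int round_split_to_mss(unsigned int pkt_len, unsigned int mss)
{
	assert(mss > 0);

	if (pkt_len < mss)
		return 0;		/* less than one full segment */
	if (pkt_len < 2 * mss)
		return mss;		/* common case: one segment, no divide */
	return (pkt_len / mss) * mss;	/* rare case: pay for the divide */
}
```

With an mss of 1448, a split point of 2000 rounds down to 1448 without
dividing, while 3000 takes the division path and yields 2896.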

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:12:28 +0000 (21:12 -0800)]
tcp: more aggressive skipping

I already knew when rewriting the sacktag code that this condition
was too conservative. Change it now, since the change prevents a lot
of useless work (especially in the sack shifter decision code
that is being added by a later patch). This shouldn't really
change anything, just save some processing regardless of the
shifter.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:11:55 +0000 (21:11 -0800)]
tcp: move tcp_simple_retransmit to tcp_input

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 25 Nov 2008 05:03:43 +0000 (21:03 -0800)]
tcp: collapse more than two on retransmission

I had always thought that collapsing up to two at a time was an
intentional decision to avoid excessive processing if 1-byte
skbs are to be combined for a full mtu, and that consecutive
retransmissions would double the size of the retransmittee
each round anyway, but some recent discussion made me
understand that this was not the case. Thus make collapse work
more and wait less.

It would be possible to take advantage of the shifting
machinery (added in the later patch) in the case of paged
data but that can be implemented on top of this change.

tcp_skb_is_last check is now provided by the loop.
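
As a rough userspace model of the idea (not the kernel code), the
collapse can be pictured as a loop over payload sizes. The function
name collapse_into_first() and the array representation are
assumptions made for illustration; real skbs carry far more state.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative model only: seg_len[] holds the payload sizes of
 * consecutive unacknowledged segments. Starting from seg_len[0], keep
 * absorbing the following segments while the merged payload still fits
 * in one mss. Returns how many segments were merged into the first and
 * stores the combined size in *merged_len.
 */
static size_t collapse_into_first(const unsigned int *seg_len, size_t n,
				  unsigned int mss, unsigned int *merged_len)
{
	size_t merged = 0;
	unsigned int total;

	assert(n > 0 && seg_len[0] <= mss);
	total = seg_len[0];

	/* The loop bound doubles as the "is this the last segment" check. */
	for (size_t i = 1; i < n; i++) {
		if (total + seg_len[i] > mss)
			break;
		total += seg_len[i];
		merged++;
	}
	*merged_len = total;
	return merged;
}
```

Fed forty 1-byte segments and an mss of 1448, this model merges 39 of
them into the first and reports a combined size of 40 bytes.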

I tested a bit (ss-after-idle-off, fill 4096x4096B xfer,
10s sleep + 4096 x 1byte writes while dropping them for
a while with netem):

16774097:16775545(1448) ack 1 win 46
16775545:16776993(1448) ack 1 win 46
. ack 16759617 win 2399
16776993:16777217(224) ack 1 win 46
. ack 16762513 win 2399
. ack 16765409 win 2399
. ack 16768305 win 2399
. ack 16771201 win 2399
. ack 16774097 win 2399
. ack 16776993 win 2399
. ack 16777217 win 2399
16777217:16777257(40) ack 1 win 46
. ack 16777257 win 2399
16777257:16778705(1448) ack 1 win 46
16778705:16780153(1448) ack 1 win 46
FP 16780153:16781313(1160) ack 1 win 46
. ack 16778705 win 2399
. ack 16780153 win 2399
F 1:1(0) ack 16781314 win 2399

Without the drop-all period I get this:

16773585:16775033(1448) ack 1 win 46
. ack 16764897 win 9367
. ack 16767793 win 9367
. ack 16770689 win 9367
. ack 16773585 win 9367
16775033:16776481(1448) ack 1 win 46
16776481:16777217(736) ack 1 win 46
. ack 16776481 win 9367
. ack 16777217 win 9367
16777217:16777218(1) ack 1 win 46
16777218:16777219(1) ack 1 win 46
16777219:16777220(1) ack 1 win 46
  ...
16777247:16777248(1) ack 1 win 46
. ack 16777218 win 9367
. ack 16777219 win 9367
  ...
. ack 16777233 win 9367
. ack 16777248 win 9367
16777248:16778696(1448) ack 1 win 46
16778696:16780144(1448) ack 1 win 46
FP 16780144:16781313(1169) ack 1 win 46
. ack 16780144 win 9367
F 1:1(0) ack 16781314 win 9367

The window seems to be 30-40 segments, which were successfully
combined into: P 16777217:16777257(40) ack 1 win 46

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 25 Nov 2008 00:07:50 +0000 (16:07 -0800)]
net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames()

We can reduce pressure on the dst entry refcount that slows down the
UDP transmit path on SMP machines. This pressure is visible on RTP
servers when delivering content to media gateways, especially big
ones handling thousands of streams. Several CPUs send UDP frames to
the same destination, hence use the same dst entry.

This patch makes ip_push_pending_frames() steal the refcount its
callers had to take when filling inet->cork.dst.

This doesn't avoid all refcounting, but still gives speedups on SMP,
on the UDP/RAW transmit path.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 24 Nov 2008 23:52:46 +0000 (15:52 -0800)]
net: avoid a pair of dst_hold()/dst_release() in ip_append_data()

We can reduce pressure on the dst entry refcount that slows down the
UDP transmit path on SMP machines. This pressure is visible on RTP
servers when delivering content to media gateways, especially big
ones handling thousands of streams. Several CPUs send UDP frames to
the same destination, hence use the same dst entry.

This patch makes ip_append_data() eventually steal the refcount its
callers had to take on the dst entry.

This doesn't avoid all refcounting, but still gives speedups on SMP,
on the UDP/RAW transmit path.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarek Poplawski [Mon, 24 Nov 2008 23:48:05 +0000 (15:48 -0800)]
net: gen_estimator: Fix gen_kill_estimator() lookups

gen_kill_estimator() linear list lookups are very slow, and e.g. while
deleting a large number of HTB classes soft lockups were reported. Here
is another try to fix this problem: this time internally, with an rbtree,
so similarly to Jamal's hashing idea IIRC. (Looking for next hits could
still be optimized, but it's really fast as it is.)

Reported-by: Badalian Vyacheslav <slavon@bigtelecom.ru>
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Mon, 24 Nov 2008 23:46:08 +0000 (15:46 -0800)]
pkt_sched: sch_drr: fix drr_dequeue loop()

Jarek Poplawski points out:

If all child qdiscs of sch_drr are non-work-conserving (e.g. sch_tbf)
drr_dequeue() will busy-loop waiting for skbs instead of leaving the
job for a watchdog. Checking for list_empty() in each loop isn't
necessary either, because this can never be true except the first time.

Using non-work-conserving qdiscs as children of DRR makes no sense,
simply bail out in that case.

Reported-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Chen [Mon, 24 Nov 2008 23:34:00 +0000 (15:34 -0800)]
infiniband: Kill directly reference of netdev->priv

This use of netdev->priv is wrong.
The right way is:
alloc_netdev() with no memory for private data.
make netdev->ml_priv point to c2_dev.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Chen [Mon, 24 Nov 2008 22:52:16 +0000 (14:52 -0800)]
netdevice sbni: Convert directly reference of netdev->priv

1. convert netdev->priv to netdev_priv().
2. make sbni_pci_probe() be static.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jirka Pirko [Mon, 24 Nov 2008 22:49:11 +0000 (14:49 -0800)]
tokenring/3c359.c: Prevent possible mem leak when open failed

Freeing previously allocated buffers in case of error.

Signed-off-by: Jirka Pirko <jirka@pirko.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jirka Pirko [Mon, 24 Nov 2008 22:48:25 +0000 (14:48 -0800)]
tokenring/3c359.c: Fix error message when allocating tx_ring

Pointed out by Joe Perches. Error message after tx_ring allocation check was
wrong.

Signed-off-by: Jirka Pirko <jirka@jirka.pirko.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jirka Pirko [Mon, 24 Nov 2008 22:47:53 +0000 (14:47 -0800)]
tokenring/3c359.c: fix allocation null check

Fixed typo when allocating rx_ring, tx_ring was checked for null instead.

Signed-off-by: Jirka Pirko <jirka@pirko.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Mon, 24 Nov 2008 22:47:01 +0000 (14:47 -0800)]
8139too: use err.h macros

Instead of using call by reference, use the PTR_ERR macros to handle
the return value in the error case. Compile tested only.
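
For readers unfamiliar with the err.h idiom, the following is a
userspace sketch of how an error code can travel inside a pointer
return value. The sketch_* names and SKETCH_MAX_ERRNO are stand-ins
chosen for this example; the kernel's own macros live in
include/linux/err.h and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: error codes fit in the top 4095 addresses. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long error)
{
	assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True when the pointer value falls in the reserved error range. */
static int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

A caller then needs only one return value: it tests the result with
sketch_is_err() and, on failure, extracts the code with
sketch_ptr_err() instead of passing an extra int pointer for the error.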

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 24 Nov 2008 22:05:22 +0000 (14:05 -0800)]
net: Make sure BHs are disabled in sock_prot_inuse_add()

There is still a call to sock_prot_inuse_add() in af_netlink
while in a preemptable section. Add explicit BH disable around
this call.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 24 Nov 2008 08:09:29 +0000 (00:09 -0800)]
net: Make sure BHs are disabled in sock_prot_inuse_add()

The rule of calling sock_prot_inuse_add() is that BHs must
be disabled.  Some new calls were added where this was not
true and this triggers warnings as reported by Ilpo.

Fix this by adding explicit BH disabling around those call sites,
or moving sock_prot_inuse_add() call inside an existing BH disabled
section.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 24 Nov 2008 07:24:32 +0000 (23:24 -0800)]
eth: Declare an optimized compare_ether_addr_64bits() function

Linus mentioned we could try to perform long word operations, even
on potentially unaligned addresses, on x86 at least. David mentioned
the HAVE_EFFICIENT_UNALIGNED_ACCESS test to handle this on all
arches that have efficient unaligned accesses.

I tried this idea and got nice assembly on 32 bits:

158:   33 82 38 01 00 00       xor    0x138(%edx),%eax
15e:   33 8a 34 01 00 00       xor    0x134(%edx),%ecx
164:   c1 e0 10                shl    $0x10,%eax
167:   09 c1                   or     %eax,%ecx
169:   74 0b                   je     176 <eth_type_trans+0x87>

And very nice assembly on 64 bits of course (one xor, one shl)

Nice oprofile improvement in eth_type_trans(), 0.17 % instead of 0.41 %,
expected since we remove 8 instructions on a fast path.

This patch implements a compare_ether_addr_64bits() function, that
uses the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS ifdef to efficiently
perform the 6 bytes comparison on all capable arches.
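
The long-word comparison can be approximated in portable userspace C.
This is a sketch under assumptions, not the kernel function:
ether_addr_differs_64() is a hypothetical name, both buffers are
assumed to have two readable padding bytes after the address, and
memcpy stands in for the direct unaligned loads the patch relies on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace sketch: compare two 6-byte Ethernet addresses with one
 * 64-bit XOR. Both buffers must be at least 8 bytes long (6 address
 * bytes plus 2 readable padding bytes). memcpy keeps the loads well
 * defined at any alignment.
 * Returns 0 when the addresses are equal, 1 otherwise.
 */
static unsigned int ether_addr_differs_64(const uint8_t a[8],
					  const uint8_t b[8])
{
	static const uint8_t addr_bytes[8] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00
	};
	uint64_t x, y, mask;

	assert(a && b);
	memcpy(&x, a, sizeof(x));
	memcpy(&y, b, sizeof(y));
	/* Building the mask from bytes makes it endian-independent. */
	memcpy(&mask, addr_bytes, sizeof(mask));

	/* Differences in the two padding bytes are masked away. */
	return ((x ^ y) & mask) != 0;
}
```

The mask replaces the shift used to discard the padding bytes, which
keeps the sketch independent of byte order at the cost of one extra
AND.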

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Nov 2008 04:01:59 +0000 (20:01 -0800)]
axnet_cs: Fix build after net device ops ne2k conversion.

Commit 4e4fd4e485ad63a9074ff09a9b53ffc7a5c594ec ("ne2k: convert to
net_device_ops") exported some ei_* symbols from the 8390 library,
but the axnet_cs driver defines local static versions of the same
functions.

Rename them to avoid the namespace conflict.

Reported by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Nov 2008 01:34:03 +0000 (17:34 -0800)]
net: Make sure BHs are disabled in sock_prot_inuse_add()

The rule of calling sock_prot_inuse_add() is that BHs must
be disabled.  Some new calls were added where this was not
true and this triggers warnings as reported by Ilpo.

Fix this by adding explicit BH disabling around those call sites.

Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Mon, 24 Nov 2008 01:26:26 +0000 (17:26 -0800)]
net: fix tunnels in netns after ndo_ changes

dev_net_set() should be the very first thing after alloc_netdev().

"ndo_" changes turned a simple assignment (which is OK to do before netns
assignment) into a quite non-trivial operation (which is not OK, since
init_net was used). This leads to incomplete initialisation of the tunnel
device in netns.

BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f
*pde = 00000000
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
last sysfs file: /sys/class/net/lo/operstate

Pid: 10, comm: netns Not tainted (2.6.28-rc6 #1)
EIP: 0060:[<c02efdb5>] EFLAGS: 00010246 CPU: 0
EIP is at ip6_tnl_exit_net+0x37/0x4f
EAX: 00000000 EBX: 00000020 ECX: 00000000 EDX: 00000003
ESI: c5caef30 EDI: c782bbe8 EBP: c7909f50 ESP: c7909f48
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process netns (pid: 10, ti=c7908000 task=c7905780 task.ti=c7908000)
Stack:
 c03e75e0 c7390bc8 c7909f60 c0245448 c7390bd8 c7390bf0 c7909fa8 c012577a
 00000000 00000002 00000000 c0125736 c782bbe8 c7909f90 c0308fe3 c782bc04
 c7390bd4 c0245406 c084b718 c04f0770 c03ad785 c782bbe8 c782bc04 c782bc0c
Call Trace:
 [<c0245448>] ? cleanup_net+0x42/0x82
 [<c012577a>] ? run_workqueue+0xd6/0x1ae
 [<c0125736>] ? run_workqueue+0x92/0x1ae
 [<c0308fe3>] ? schedule+0x275/0x285
 [<c0245406>] ? cleanup_net+0x0/0x82
 [<c0125ae1>] ? worker_thread+0x81/0x8d
 [<c0128344>] ? autoremove_wake_function+0x0/0x33
 [<c0125a60>] ? worker_thread+0x0/0x8d
 [<c012815c>] ? kthread+0x39/0x5e
 [<c0128123>] ? kthread+0x0/0x5e
 [<c0103b9f>] ? kernel_thread_helper+0x7/0x10
Code: db e8 05 ff ff ff 89 c6 e8 dc 04 f6 ff eb 08 8b 40 04 e8 38 89 f5 ff 8b 44 9e 04 85 c0 75 f0 43 83 fb 20 75 f2 8b 86 84 00 00 00 <8b> 40 04 e8 1c 89 f5 ff e8 98 04 f6 ff 89 f0 e8 f8 63 e6 ff 5b
EIP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f SS:ESP 0068:c7909f48
---[ end trace 6c2f2328fccd3e0c ]---

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 24 Nov 2008 01:22:55 +0000 (17:22 -0800)]
net: Convert TCP/DCCP listening hash tables to use RCU

This is the last step to be able to perform full RCU lookups
in __inet_lookup() : After established/timewait tables, we
add RCU lookups to listening hash table.

The only trick here is that a socket of a given type (TCP ipv4,
TCP ipv6, ...) can now move between two different tables
(established and listening) during an RCU grace period, so we
must use different 'nulls' end-of-chain values for the two tables.

We define a large value :

#define LISTENING_NULLS_BASE (1U << 29)

So that slots in listening table are guaranteed to have different
end-of-chain values than slots in established table. A reader can
still detect it finished its lookup in the right chain.
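
The disjoint end-of-chain values can be modelled with plain integers.
The helper names below are hypothetical and the pointer encoding used
by the kernel's nulls lists is deliberately left out; only the
LISTENING_NULLS_BASE value comes from the text above.

```c
#include <assert.h>

#define LISTENING_NULLS_BASE (1U << 29)

/* Hypothetical helper: end-of-chain marker for an established slot. */
static unsigned int established_nulls(unsigned int slot)
{
	assert(slot < LISTENING_NULLS_BASE);
	return slot;
}

/* Hypothetical helper: end-of-chain marker for a listening slot. */
static unsigned int listening_nulls(unsigned int slot)
{
	assert(slot < LISTENING_NULLS_BASE);
	return LISTENING_NULLS_BASE + slot;
}

/*
 * A lockless reader that reaches a marker other than the one for the
 * chain it started in has followed a socket that moved to another
 * chain, so the lookup has to be restarted.
 */
static int lookup_must_restart(unsigned int marker_seen,
			       unsigned int marker_expected)
{
	return marker_seen != marker_expected;
}
```

Because the two ranges never overlap in this model, a reader that
began in listening slot 5 and ends on the marker of established slot 5
still notices the mismatch.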

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 24 Nov 2008 00:10:23 +0000 (16:10 -0800)]
dccp: Header option insertion routine for feature-negotiation

The patch extends existing code:
 * Confirm options divide into the confirmed value plus an optional preference
   list for SP values. Previously only the preference list was echoed for SP
   values, now the confirmed value is added as per RFC 4340, 6.1;
 * length and sanity checks are added to avoid illegal memory (or NULL) access.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 24 Nov 2008 00:09:11 +0000 (16:09 -0800)]
dccp: Support for Mandatory options

Support for Mandatory options is provided by this patch, which will
be used by subsequent feature-negotiation patches.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 24 Nov 2008 00:07:53 +0000 (16:07 -0800)]
dccp: Increase the scope of variable-length htonl/ntohl functions

This extends the scope of two available functions,
encode|decode_value_var, to work up to 6 (8) bytes, to match maximum
requirements in the RFC.

These functions are going to be used both by general option processing
and feature negotiation code, hence declarations have been put into
feat.h.
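
A variable-length network-order encoder and decoder of this kind can be
sketched as below. The *_sketch names and exact signatures are
assumptions for illustration, not the declarations placed in feat.h;
the sketch accepts 1 to 6 bytes and carries the value in a 64-bit
integer.

```c
#include <assert.h>
#include <stdint.h>

/* Write the low `len` bytes of value to `to`, most significant first. */
static void encode_value_var_sketch(uint64_t value, uint8_t *to,
				    unsigned int len)
{
	assert(len >= 1 && len <= 6);
	/* Fill from the last byte backwards. */
	while (len--) {
		to[len] = (uint8_t)(value & 0xff);
		value >>= 8;
	}
}

/* Read `len` network-order bytes from `from` into a host integer. */
static uint64_t decode_value_var_sketch(const uint8_t *from,
					unsigned int len)
{
	uint64_t value = 0;

	assert(len >= 1 && len <= 6);
	for (unsigned int i = 0; i < len; i++)
		value = (value << 8) | from[i];
	return value;
}
```

Working byte by byte avoids any dependence on host byte order, which is
why no htonl-style conversion appears in the sketch.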

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 24 Nov 2008 00:04:59 +0000 (16:04 -0800)]
dccp: API to query the current TX/RX CCID

This provides a function to query the current TX/RX CCID dynamically,
without reliance on the minisock value, using dynamic information
available in the currently loaded CCID module.

This query function is then used to
 (a) provide the getsockopt part for getting/setting CCIDs via sockopts;
 (b) replace the current test for "which CCID is in use" in probe.c.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 24 Nov 2008 00:02:31 +0000 (16:02 -0800)]
dccp: Set per-connection CCIDs via socket options

With this patch, TX/RX CCIDs can now be changed on a per-connection
basis, which overrides the defaults set by the global sysctl variables
for TX/RX CCIDs.

To make full use of this facility, the remaining patches of this patch
set are needed, which track dependencies and activate negotiated
feature values.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brice Goglin [Sun, 23 Nov 2008 23:49:54 +0000 (15:49 -0800)]
myri10ge: update firmware headers

Update myri10ge firmware headers.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brice Goglin [Sun, 23 Nov 2008 23:49:28 +0000 (15:49 -0800)]
myri10ge: update DCA comments

Update DCA sections closing comments.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 23 Nov 2008 23:48:22 +0000 (15:48 -0800)]
net: af_netlink should update its inuse counter

In order to have relevant information for NETLINK protocol, in
/proc/net/protocols, we should use sock_prot_inuse_add() to
update a (percpu and pernamespace) counter of inuse sockets.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 23 Nov 2008 23:42:23 +0000 (15:42 -0800)]
net: some optimizations in af_inet

1) Use eq_net() in inet_netns_ok() to speedup socket creation if
   !CONFIG_NET_NS

2) Reorder the tests about inet_ehash_secret generation (once only)
   Use the unlikely() macro when testing if inet_ehash_secret already
   generated.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 22 Nov 2008 05:30:58 +0000 (21:30 -0800)]
Merge branch 'for-david' of git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6

Alexander Duyck [Sat, 22 Nov 2008 05:30:24 +0000 (21:30 -0800)]
igb: do not use phy ops in ethtool test cleanup for non-copper parts

Currently the igb driver is experiencing a panic due to a null function
pointer being used during the cleanup of the ethtool loopback test on
fiber/serdes parts.  This patch prevents that and adds a check prior to
calling any phy function.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Feldman [Sat, 22 Nov 2008 05:29:25 +0000 (21:29 -0800)]
enic: misc cleanup items:

Clarify reading PBA has no side-effect (clearing).
Add missing GPL license text.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Feldman [Sat, 22 Nov 2008 05:29:01 +0000 (21:29 -0800)]
enic: move wmb closer to where needed: before writing posted_index to hw

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Feldman [Sat, 22 Nov 2008 05:28:40 +0000 (21:28 -0800)]
enic: mask off some reserved bits in CQ descriptor for future use

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Feldman [Sat, 22 Nov 2008 05:28:18 +0000 (21:28 -0800)]
enic: driver/firmware API updates

Add driver/firmware compatibility check.
Update firmware notify cmd to honor notify area size.
Add new version of init cmd.
Add link_down_cnt to notify area to track link down count.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Feldman [Sat, 22 Nov 2008 05:26:55 +0000 (21:26 -0800)]
enic: enable ethtool LRO support

Enable ethtool support for get/set_flags so LRO can be turned on/off
by forwarding drivers such as the bridge driver.  LRO is not compatible
with forwarding drivers.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Krzysztof Hałasa [Thu, 20 Nov 2008 14:51:05 +0000 (15:51 +0100)]
WAN pc300too.c: Fix PC300-X.21 detection

pc300too driver works around a bug in PCI9050 bridge.  Unfortunately
it was doing that too late.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Krzysztof Hałasa [Thu, 14 Aug 2008 17:18:17 +0000 (19:18 +0200)]
WAN: syncppp.c is no longer used by any kernel code. Remove it.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Krzysztof Hałasa [Thu, 14 Aug 2008 17:17:38 +0000 (19:17 +0200)]
WAN: new synchronous PPP implementation for generic HDLC.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Krzysztof Hałasa [Thu, 10 Jul 2008 22:13:09 +0000 (00:13 +0200)]
WAN: Simplify sca_init_port() in HD64572 driver.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Krzysztof Hałasa [Wed, 9 Jul 2008 22:30:51 +0000 (00:30 +0200)]
WAN: Correct comments in hd6457[02].c

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Krzysztof Hałasa [Wed, 9 Jul 2008 21:39:12 +0000 (23:39 +0200)]
WAN: HD64572 drivers don't use next_desc() anymore.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Krzysztof Hałasa [Wed, 9 Jul 2008 21:13:49 +0000 (23:13 +0200)]
WAN: Simplify HD64572 drivers.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Krzysztof Hałasa [Wed, 9 Jul 2008 19:30:17 +0000 (21:30 +0200)]
WAN: don't print HD64572 driver versions anymore.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: Simplify HD64572 status handling.
Krzysztof Hałasa [Wed, 9 Jul 2008 19:24:42 +0000 (21:24 +0200)]
WAN: Simplify HD64572 status handling.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: rework HD64572 interrupts a bit.
Krzysztof Hałasa [Wed, 9 Jul 2008 17:28:45 +0000 (19:28 +0200)]
WAN: rework HD64572 interrupts a bit.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: HD64572 already handles TX underruns with DMAC.
Krzysztof Hałasa [Wed, 9 Jul 2008 18:01:23 +0000 (20:01 +0200)]
WAN: HD64572 already handles TX underruns with DMAC.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: TX-done handler now uses the ownership bit in HD64572 drivers.
Krzysztof Hałasa [Wed, 9 Jul 2008 17:47:05 +0000 (19:47 +0200)]
WAN: TX-done handler now uses the ownership bit in HD64572 drivers.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: convert HD64572-based drivers to NAPI.
Krzysztof Hałasa [Wed, 9 Jul 2008 14:49:37 +0000 (16:49 +0200)]
WAN: convert HD64572-based drivers to NAPI.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: remove SCA support from SCA-II drivers
Krzysztof Hałasa [Mon, 24 Mar 2008 19:24:23 +0000 (20:24 +0100)]
WAN: remove SCA support from SCA-II drivers

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: remove SCA II support from SCA drivers
Krzysztof Hałasa [Mon, 24 Mar 2008 18:12:23 +0000 (19:12 +0100)]
WAN: remove SCA II support from SCA drivers

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agoWAN: split hd6457x.c into hd64570.c and hd64572.c
Krzysztof Hałasa [Mon, 24 Mar 2008 15:39:02 +0000 (16:39 +0100)]
WAN: split hd6457x.c into hd64570.c and hd64572.c

Supporting both original SCA and SCA-II in one file was nice at some
point but now it's increasingly painful.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
15 years agone2k: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:39:02 +0000 (17:39 -0800)]
ne2k: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.
This required some additional work to export common code ei_XXX.
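
As a rough illustration of what such a conversion involves, the sketch
below uses mocked user-space types (every sketch_* name is hypothetical,
not the kernel's definition) to show callbacks gathered into one shared,
read-only ops table that the device points to, instead of being stored
as separate pointers in each device structure.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_net_device;

/* Mock ops table: one shared, read-only set of callbacks.
 * The member names mirror the ndo_* naming style; the struct itself
 * is a stand-in, not the real kernel structure. */
struct sketch_net_device_ops {
    int (*ndo_open)(struct sketch_net_device *dev);
    int (*ndo_stop)(struct sketch_net_device *dev);
};

/* Mock device: holds a single pointer to the ops table. */
struct sketch_net_device {
    const struct sketch_net_device_ops *netdev_ops;
    int is_up;
};

static int sketch_open(struct sketch_net_device *dev)
{
    dev->is_up = 1;
    return 0;
}

static int sketch_stop(struct sketch_net_device *dev)
{
    dev->is_up = 0;
    return 0;
}

/* Defined once per driver and shared by all of its devices. */
static const struct sketch_net_device_ops sketch_ops = {
    .ndo_open = sketch_open,
    .ndo_stop = sketch_stop,
};

/* After conversion, setup assigns one pointer instead of many. */
static inline void sketch_setup(struct sketch_net_device *dev)
{
    dev->netdev_ops = &sketch_ops;
    dev->is_up = 0;
}
```

Because the table is const and shared, a driver with many devices keeps
a single copy of its callbacks.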

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoeql: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:37:54 +0000 (17:37 -0800)]
eql: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosc92031: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:37:24 +0000 (17:37 -0800)]
sc92031: convert to net_device_ops

Convert this driver to net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqla3xxx: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:36:58 +0000 (17:36 -0800)]
qla3xxx: convert to net_device_ops

Convert this driver to net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agohamachi: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:36:36 +0000 (17:36 -0800)]
hamachi: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agobnx2x: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:36:04 +0000 (17:36 -0800)]
bnx2x: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agons83820: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:35:40 +0000 (17:35 -0800)]
ns83820: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoyellowfin: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:35:16 +0000 (17:35 -0800)]
yellowfin: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agor6040: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:34:56 +0000 (17:34 -0800)]
r6040: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosis900: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:34:32 +0000 (17:34 -0800)]
sis900: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotehuti: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:34:09 +0000 (17:34 -0800)]
tehuti: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosfc: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:32:54 +0000 (17:32 -0800)]
sfc: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonetxen: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:32:15 +0000 (17:32 -0800)]
netxen: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.
Had to do some refactoring on multicast_list.
Fix ethtool restart to propagate error code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agodl2k: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:31:51 +0000 (17:31 -0800)]
dl2k: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agobnx2: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:31:27 +0000 (17:31 -0800)]
bnx2: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agomlx4: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:30:58 +0000 (17:30 -0800)]
mlx4: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agomyri10ge: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:30:35 +0000 (17:30 -0800)]
myri10ge: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agovia-rhine: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:30:11 +0000 (17:30 -0800)]
via-rhine: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqlge: fix sparse warnings
Stephen Hemminger [Sat, 22 Nov 2008 01:29:50 +0000 (17:29 -0800)]
qlge: fix sparse warnings

Fix sparse warnings and one bug:
    * Several routines can be static
    * Don't lose __iomem annotation
    * fix locking on error path (bug)

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoqlge: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:29:16 +0000 (17:29 -0800)]
qlge: convert to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agos2io: convert to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:28:55 +0000 (17:28 -0800)]
s2io: convert to net_device_ops

Convert this driver to network device ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agojme: convert driver to net_device_ops
Stephen Hemminger [Sat, 22 Nov 2008 01:28:33 +0000 (17:28 -0800)]
jme: convert driver to net_device_ops

Convert driver to new net_device_ops. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Update version to 3.96
Matt Carlson [Sat, 22 Nov 2008 01:23:26 +0000 (17:23 -0800)]
tg3: Update version to 3.96

This patch updates the version number to 3.96.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agobroadcom: Add 57780 support
Matt Carlson [Sat, 22 Nov 2008 01:22:53 +0000 (17:22 -0800)]
broadcom: Add 57780 support

This patch adds the 57780 PHY ID to the broadcom module.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Add 57780 support
Matt Carlson [Sat, 22 Nov 2008 01:22:19 +0000 (17:22 -0800)]
tg3: Add 57780 support

This patch adds support for the 57780 ASIC revision.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Allow GPHY powerdown on 5761
Matt Carlson [Sat, 22 Nov 2008 01:21:13 +0000 (17:21 -0800)]
tg3: Allow GPHY powerdown on 5761

The ENABLE_APE flag tells the driver whether or not the device has an
Application Processing Engine (APE).  The APE does not need the PHY to
be powered unless it is running management firmware.  For backwards
compatibility, management firmware will still set the ENABLE_ASF bit.
Consequently, there is no reason to consider the ENABLE_APE flag when
deciding whether or not to power down the PHY.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Embrace pci_ioremap_bar()
Matt Carlson [Sat, 22 Nov 2008 01:20:32 +0000 (17:20 -0800)]
tg3: Embrace pci_ioremap_bar()

Per Dave Miller's suggestion, replace the remaining ioremap_nocache()
call with pci_ioremap_bar().  Remove the two IORESOURCE_MEM checks as
they are redundant.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Extract FW ver from alt NVRAM formats
Matt Carlson [Sat, 22 Nov 2008 01:19:41 +0000 (17:19 -0800)]
tg3: Extract FW ver from alt NVRAM formats

This patch extracts the bootcode firmware version from the alternate
selfboot patch NVRAM format.  This format is used on the 5784, 5761 and
some newer devices.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Enable GPHY APD on select devices
Matt Carlson [Sat, 22 Nov 2008 01:18:59 +0000 (17:18 -0800)]
tg3: Enable GPHY APD on select devices

GPHY Autopowerdown (APD) is a way to save power when energy is not
detected on the wire.  At the moment, only the 5784 and 5761 are
capable of enabling this mode.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Prevent corruption at 10 / 100Mbps w CLKREQ
Matt Carlson [Sat, 22 Nov 2008 01:18:16 +0000 (17:18 -0800)]
tg3: Prevent corruption at 10 / 100Mbps w CLKREQ

This patch disables CLKREQ at 10Mbps and 100Mbps to work around a TX BD
corruption issue.  This problem only affects the 5784 and 5761 (and
57780 AX) ASIC revisions.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Qualify use of tp->pcix_cap
Matt Carlson [Sat, 22 Nov 2008 01:17:04 +0000 (17:17 -0800)]
tg3: Qualify use of tp->pcix_cap

This patch makes sure the device is a PCIX device before attempting to
use the pcix_cap device structure member.  This is prep work for the
following patch.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Use NET_IP_ALIGN
Matt Carlson [Sat, 22 Nov 2008 01:16:16 +0000 (17:16 -0800)]
tg3: Use NET_IP_ALIGN

This patch replaces hardcoded 2's with the NET_IP_ALIGN constant or
TG3_RAW_IP_ALIGN where appropriate.  Some platforms can redefine the
NET_IP_ALIGN definition to zero if unaligned DMA transfers cost more
than the IP header alignment gains.  This patch represents a
performance improvement when using the 5701 on these platforms.
The copy path can be avoided.

TG3_RAW_IP_ALIGN is used in cases where we always want to align the
IP header on dword boundaries.
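
The arithmetic behind the hardcoded 2 can be sketched as follows.  This
is a stand-alone illustration, not driver code: the SKETCH_* constants
and helper names are hypothetical, and it assumes the receive buffer
itself starts on a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants.  The kernel provides its own ETH_HLEN and
 * NET_IP_ALIGN, and a platform may redefine NET_IP_ALIGN to 0. */
#define SKETCH_ETH_HLEN     14 /* Ethernet header length in bytes */
#define SKETCH_NET_IP_ALIGN  2 /* default padding before the frame */

/* Offset of the IP header from the start of the buffer when `pad`
 * bytes are reserved in front of the Ethernet header. */
static inline size_t sketch_ip_header_offset(size_t pad)
{
    return pad + SKETCH_ETH_HLEN;
}

/* Returns 1 if the IP header starts on a 4-byte (dword) boundary,
 * assuming the buffer start is itself 4-byte aligned. */
static inline int sketch_ip_header_is_dword_aligned(size_t pad)
{
    return sketch_ip_header_offset(pad) % 4 == 0;
}
```

With 2 bytes of padding the IP header lands at offset 16, which is dword
aligned.  With 0 bytes it lands at offset 14, which is not; that is the
trade-off a platform accepts when it sets NET_IP_ALIGN to zero.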

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonet: remove redundant argument comments
Qinghuang Feng [Sat, 22 Nov 2008 01:15:03 +0000 (17:15 -0800)]
net: remove redundant argument comments

Remove redundant argument comments in files of net/*

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Sat, 22 Nov 2008 01:05:11 +0000 (17:05 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next-2.6

15 years agoe1000e: check return code from NVM accesses and fix bank detection
Bruce Allan [Sat, 22 Nov 2008 01:02:41 +0000 (17:02 -0800)]
e1000e: check return code from NVM accesses and fix bank detection

Check return code for all NVM accesses[1] and error out accordingly; log
a debug message for failed accesses.

For ICH8/9, the valid NVM bank detect function was not checking whether the
SEC1VAL (sector 1 valid) bit in the EECD register was itself valid (bits 8
and 9 also have to be set).  If invalid, it would have defaulted to the
possibly invalid bank 0.  Instead, try to use the valid bank detection
method used by ICH10 which has been cleaned up a bit.

[1] - reads and updates only; writes are excluded because they only modify
the Shadow RAM.  The update that follows a write is the only step that
actually writes the modified Shadow RAM contents to the NVM.
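
The validity check described above can be sketched roughly as follows.
The macro and function names are hypothetical, and the SEC1VAL bit
position is an assumption made for illustration; only the requirement
that bits 8 and 9 both be set comes from the description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define SKETCH_EECD_VALID_MASK ((1u << 8) | (1u << 9)) /* bits 8 and 9 */
#define SKETCH_EECD_SEC1VAL    (1u << 22)              /* assumed position */

/* Returns 0 or 1 for the bank indicated by SEC1VAL, or -1 when bits 8
 * and 9 are not both set, meaning SEC1VAL cannot be trusted and the
 * caller should fall back to another detection method. */
static inline int sketch_bank_from_eecd(uint32_t eecd)
{
    if ((eecd & SKETCH_EECD_VALID_MASK) != SKETCH_EECD_VALID_MASK)
        return -1;
    return (eecd & SKETCH_EECD_SEC1VAL) ? 1 : 0;
}
```

Returning a distinct "unknown" value, rather than defaulting to bank 0,
is what lets the caller try a different method instead of silently
using a possibly invalid bank.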

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: fix incorrect link status when switch module pulled
Bruce Allan [Sat, 22 Nov 2008 01:01:35 +0000 (17:01 -0800)]
e1000e: fix incorrect link status when switch module pulled

On 82571 with SerDes, the link state read from the STATUS register is not
always correct; use the existing e1000_has_link() function instead.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: store EEPROM version number to prevent unnecessary NVM reads
Bruce Allan [Sat, 22 Nov 2008 01:00:22 +0000 (17:00 -0800)]
e1000e: store EEPROM version number to prevent unnecessary NVM reads

Rather than reading the NVM to get the EEPROM version number every time the
ethtool get_drvinfo function is called, read it once during probe and save
it for future reference.
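
The read-once pattern can be sketched as below.  This is a user-space
mock, not the driver's code: every sketch_* name is hypothetical, the
NVM read is simulated by a counter, and the version value is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

static int sketch_nvm_reads; /* counts simulated NVM accesses */

/* Stand-in for a slow NVM read; the returned value is arbitrary. */
static uint16_t sketch_read_nvm_version(void)
{
    sketch_nvm_reads++;
    return 0x0105;
}

struct sketch_adapter {
    uint16_t eeprom_vers; /* cached once at probe time */
};

/* Probe: perform the single NVM read and keep the result. */
static inline void sketch_probe(struct sketch_adapter *adapter)
{
    adapter->eeprom_vers = sketch_read_nvm_version();
}

/* Stand-in for the get_drvinfo path: returns the cached value
 * without touching the NVM. */
static inline uint16_t sketch_drvinfo_version(const struct sketch_adapter *adapter)
{
    return adapter->eeprom_vers;
}
```

However many times the drvinfo path runs, the simulated NVM is read
only once, during probe.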

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: cosmetic newline in debug message
Bruce Allan [Sat, 22 Nov 2008 00:59:54 +0000 (16:59 -0800)]
e1000e: cosmetic newline in debug message

Add missing newline from debug message.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: sync change flow control variables with ixgbe
Bruce Allan [Sat, 22 Nov 2008 00:57:36 +0000 (16:57 -0800)]
e1000e: sync change flow control variables with ixgbe

Sync flow control variables and usage model with that found in the ixgbe
driver.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: link up/down messages must follow a specific format
Bruce Allan [Sat, 22 Nov 2008 00:54:43 +0000 (16:54 -0800)]
e1000e: link up/down messages must follow a specific format

The system log messages created on a link status change need to follow a
specific format to work with tools some customers use.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: ESB2 config after link up
Bruce Allan [Sat, 22 Nov 2008 00:53:51 +0000 (16:53 -0800)]
e1000e: ESB2 config after link up

On ESB2, the MAC-to-PHY (Kumeran) interface must be configured after link
is up before any traffic is sent; a new PHY operations function pointer is
provided for this.  To facilitate read/write of the Kumeran registers
without blocking PHY register writes, the driver/firmware synchronization
method which previously used a hardware semaphore for both PHY and Kumeran
register accesses is now split.  New Kumeran register read/write functions
utilize this new synchronization method.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoe1000e: check return of pci_save_state
Bruce Allan [Sat, 22 Nov 2008 00:51:33 +0000 (16:51 -0800)]
e1000e: check return of pci_save_state

Check return of pci_save_state and error out accordingly.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>