Eric Dumazet [Tue, 19 May 2009 05:19:19 +0000 (22:19 -0700)]
net: release dst entry in dev_hard_start_xmit()
One point of contention under high network load is the dst_release() performed
when a transmitted skb is freed. This is because NIC tx completion calls
dev_kfree_skb() long after the original call to dev_queue_xmit(skb).
The CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% busy handling softirqs for a network device,
since dst_clone() is done by other cpus, causing cache line ping-pong.
It seems the right place to release the dst is dev_hard_start_xmit(), for most
devices except virtual ones and a few exceptions.
David Miller suggested defining a new device flag, set in alloc_netdev_mq()
(so that most devices set it at init time), and carefully cleared in devices
which don't want a NULL skb->dst in their ndo_start_xmit().
List of devices that must clear this flag is :
- loopback device, because it calls netif_rx() and quoting Patrick :
"ip_route_input() doesn't accept loopback addresses, so loopback packets
already need to have a dst_entry attached."
- appletalk/ipddp.c : needs skb->dst in its xmit function
- And all devices that call again dev_queue_xmit() from their xmit function
(as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr
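As a rough illustration of the idea (a userspace sketch, not the patch itself),
the model below drops the dst reference before calling the driver when a
per-device flag is set. All names (dst_model, pkt_model, dev_model,
xmit_dst_release) are hypothetical stand-ins, and the plain decrement stands in
for the kernel's atomic reference drop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
struct dst_model { int refcnt; };
struct pkt_model { struct dst_model *dst; };
struct dev_model {
	bool xmit_dst_release;                    /* hypothetical flag name */
	int (*start_xmit)(struct pkt_model *pkt); /* the driver's xmit hook */
};

void dst_release_model(struct dst_model *dst)
{
	if (dst)
		dst->refcnt--;  /* the kernel performs this atomically */
}

/* Release the route reference while the cache line is still warm, then
 * hand the packet to the driver with pkt->dst set to NULL. */
int hard_start_xmit_model(struct pkt_model *pkt, struct dev_model *dev)
{
	if (dev->xmit_dst_release && pkt->dst) {
		dst_release_model(pkt->dst);
		pkt->dst = NULL;
	}
	return dev->start_xmit(pkt);
}

/* A driver that never looks at pkt->dst, like most physical NICs. */
int nic_xmit_model(struct pkt_model *pkt)
{
	(void)pkt;
	return 0;
}
```

A device that still needs the dst in its xmit function would simply leave the
flag cleared, so the reference is kept until the packet is freed.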
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 13 May 2009 17:02:50 +0000 (17:02 +0000)]
net: FIX bonding sysfs rtnl_lock deadlock
Sysfs files for a network device can not unconditionally take the
rtnl_lock as the bonding sysfs files do. If someone accesses those
sysfs files while the network device is being unregistered with the
rtnl_lock held we will deadlock.
So use trylock and restart_syscall to avoid this problem.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 13 May 2009 17:01:51 +0000 (17:01 +0000)]
net: Fix ipoib rtnl_lock sysfs deadlock.
Network device sysfs files that grab the rtnl_lock unconditionally
will deadlock if accessed when the network device is being
unregistered. So use trylock and restart_syscall to avoid this
deadlock.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 13 May 2009 17:00:41 +0000 (17:00 +0000)]
net: Fix bridging sysfs handling of rtnl_lock
Holding rtnl_lock when we are unregistering the sysfs files can
deadlock if we unconditionally take rtnl_lock in a sysfs file. So fix
it with the now familiar pattern of rtnl_trylock and restart_syscall().
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 13 May 2009 16:59:21 +0000 (16:59 +0000)]
net: Fix devinet_sysctl_forward
sysctls are unregistered with the rtnl_lock held, making
it unsafe to unconditionally grab the rtnl_lock. Instead
we need to call rtnl_trylock and restart the system call
if we can not grab it. Otherwise we could deadlock at unregistration
time.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 13 May 2009 16:58:17 +0000 (16:58 +0000)]
net: FIX ipv6_forward sysctl restart
Just returning -ERESTARTSYS without a signal pending is not
good: that will just leak it to userspace. We need to return
-ERESTARTNOINTR so we always restart, and set signal pending
so that we fall off the fast path of syscall return and set up
the system call restart.
So use restart_syscall() which does all of this for us.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 13 May 2009 16:57:25 +0000 (16:57 +0000)]
net-sysfs: Use rtnl_trylock in sysfs methods.
The earlier patch to fix the deadlock between a network device going
away and writing to sysfs attributes was incomplete.
- It did not set signal_pending so we would leak ERESTARTSYS to user space.
- It used ERESTARTSYS which only restarts if sigaction configures it to.
- It did not cover store and show for ifalias.
So fix all of these up and use the new helper restart_syscall so we get
the details correct on what it takes.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 13 May 2009 16:55:10 +0000 (16:55 +0000)]
syscall: Implement a convenience function restart_syscall
Currently when we have a signal pending we have the functionality
to restart the current system call. There are other cases,
such as nasty lock ordering issues, where it makes sense to have
a simple fix that uses trylock and restarts the system call,
buying time to figure out how to rework the locking strategy.
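The trylock-and-restart pattern can be sketched in userspace as follows. This
is only a model under stated assumptions: the *_model names are hypothetical,
an atomic_flag stands in for the rtnl_lock, and a boolean stands in for the
task's signal-pending state.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define ERESTARTNOINTR_MODEL 513  /* stand-in error value for this sketch */

struct task_model { bool signal_pending; };

atomic_flag rtnl_model = ATOMIC_FLAG_INIT;  /* stand-in for the rtnl_lock */

bool rtnl_trylock_model(void)
{
	return !atomic_flag_test_and_set(&rtnl_model);
}

void rtnl_unlock_model(void)
{
	atomic_flag_clear(&rtnl_model);
}

/* Mark a signal as pending so the syscall return takes the slow path,
 * and return the code that always restarts the call. */
int restart_syscall_model(struct task_model *t)
{
	t->signal_pending = true;
	return -ERESTARTNOINTR_MODEL;
}

/* A sysfs-style store method: never block on the lock, restart instead. */
int store_attr_model(struct task_model *t, int *attr, int value)
{
	if (!rtnl_trylock_model())
		return restart_syscall_model(t);
	*attr = value;
	rtnl_unlock_model();
	return 0;
}
```

The caller never sleeps on the lock, so the unregister path that already holds
it can finish removing the file; the write is then retried from the top.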
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johann Baudy [Tue, 19 May 2009 05:11:22 +0000 (22:11 -0700)]
net: TX_RING and packet mmap
New packet socket feature that makes packet sockets more efficient for
transmission.
- It reduces the number of system calls through a PACKET_TX_RING mechanism,
based on PACKET_RX_RING (a circular buffer allocated in kernel space
which is mmapped from user space).
- It minimizes CPU copies by using fragmented SKBs (almost zero copy).
Signed-off-by: Johann Baudy <johann.baudy@gnu-log.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Tue, 19 May 2009 04:46:40 +0000 (21:46 -0700)]
netxen: fix msi irq setup
The pdev->irq was not saved in netxen_adapter, causing request_irq()
to be called with an invalid irq number.
This was broken in commit be339aee634d5cb98a8df8d6febe04002ec497f3
("netxen: fix irq tear down and msix leak.").
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 May 2009 04:08:20 +0000 (21:08 -0700)]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6
Conflicts:
drivers/scsi/fcoe/fcoe.c
Eric Dumazet [Tue, 19 May 2009 02:26:37 +0000 (19:26 -0700)]
pkt_sched: gen_estimator: use 64 bit intermediate counters for bps
gen_estimator can overflow bps (bytes per second) with Gb links, while
it was designed with a u32 API, with a theoretical limit of 34360Mbit
(2^32 bytes)
Using 64 bit intermediate avbps/brate counters can allow us to reach
this theoretical limit.
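A small worked example of why the intermediate width matters. It assumes, for
illustration only, that the running average is kept as a fixed-point value
shifted left by 5 bits (AVG_SHIFT and the function names are hypothetical); the
scaled value then stops fitting in 32 bits well before the u32 result does.

```c
#include <stdint.h>

#define AVG_SHIFT 5  /* assumed fixed-point scale, for illustration only */

/* Scale a byte rate up and back down through a 32 bit intermediate. */
uint32_t rate_via_u32(uint32_t bytes_per_sec)
{
	uint32_t scaled = bytes_per_sec << AVG_SHIFT;  /* high bits are lost */
	return scaled >> AVG_SHIFT;
}

/* Same computation through a 64 bit intermediate: nothing is lost. */
uint32_t rate_via_u64(uint32_t bytes_per_sec)
{
	uint64_t scaled = (uint64_t)bytes_per_sec << AVG_SHIFT;
	return (uint32_t)(scaled >> AVG_SHIFT);
}

/* Largest rate a u32 bytes-per-second field can express, in Mbit/s
 * (integer division, so the result is truncated). */
uint64_t u32_bps_limit_mbit(void)
{
	return (uint64_t)UINT32_MAX * 8 / 1000000;
}
```

With this scale, 200000000 bytes per second (1.6 Gbit) already wraps in the 32
bit version, while the 64 bit version stays exact up to the u32 API limit.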
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Fri, 15 May 2009 23:39:33 +0000 (23:39 +0000)]
The patch adds support for the PCI cards: PCIcan and PCIcanx (1, 2 or 4 channel) from Kvaser (kvaser.com).
Signed-off-by: Per Dalen <per.dalen@cnw.se>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Fri, 15 May 2009 23:39:32 +0000 (23:39 +0000)]
can: SJA1000 driver for EMS PCI cards
The patch adds support for the one or two channel CPC-PCI and CPC-PCIe
cards from EMS Dr. Thomas Wuensche (http://www.ems-wuensche.de).
Signed-off-by: Sebastian Haas <haas@ems-wuensche.com>
Signed-off-by: Markus Plessing <plessing@ems-wuensche.com>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Fri, 15 May 2009 23:39:31 +0000 (23:39 +0000)]
can: SJA1000 generic platform bus driver
This driver adds support for the SJA1000 chips connected to the
"platform bus", which can be found on various embedded systems.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Fri, 15 May 2009 23:39:30 +0000 (23:39 +0000)]
can: Driver for the SJA1000 CAN controller
This patch adds the generic Socket-CAN driver for the Philips SJA1000
full CAN controller.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Fri, 15 May 2009 23:39:29 +0000 (23:39 +0000)]
can: CAN Network device driver and Netlink interface
The CAN network device driver interface provides a generic interface to
setup, configure and monitor CAN network devices. It exports a set of
common data structures and functions, which all real CAN network device
drivers should use. Please have a look at the SJA1000 or MSCAN driver
to understand how to use them. The name of the module is can-dev.ko.
Furthermore, it adds a Netlink interface that allows configuring the CAN
device using the program "ip" from the iproute2 utility suite.
For further information please check "Documentation/networking/can.txt"
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Fri, 15 May 2009 23:39:28 +0000 (23:39 +0000)]
can: Update MAINTAINERS and CREDITS file
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfgang Grandegger [Fri, 15 May 2009 23:39:27 +0000 (23:39 +0000)]
can: Documentation for the CAN device driver interface
This patch documents the CAN network device drivers interface, removes
obsolete documentation and adds some useful links to CAN resources.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Mon, 18 May 2009 22:38:55 +0000 (15:38 -0700)]
be2net: add two new pci device ids to pci device table
Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anant Gole [Mon, 18 May 2009 22:19:01 +0000 (15:19 -0700)]
net: Add TI DaVinci EMAC driver
Add support for TI DaVinci EMAC driver.
TI DaVinci Ethernet Media Access Controller module is based upon
TI CPPI 3.0 DMA engine and supports 10/100 Mbps on all and Gigabit modes on
some TI devices. It supports MII/RMII and has up to 8Kbytes of internal
descriptor memory. This driver has been working on several TI devices including
DM644x, DM646x and DA830 platforms. The specs of this device are available at:
http://www.ti.com/litv/pdf/sprue24a
Signed-off-by: Anant Gole <anantgole@ti.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Chaithrika U S <chaithrika@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Mon, 18 May 2009 01:47:52 +0000 (01:47 +0000)]
ipv4: cleanup: remove unnecessary include.
There is no need for net/icmp.h header in net/ipv4/fib_frontend.c.
This patch removes the #include <net/icmp.h> from it.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Mon, 18 May 2009 01:19:12 +0000 (01:19 +0000)]
ipv4: cleanup - remove two unused parameters from fib_semantic_match().
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 18 May 2009 00:35:38 +0000 (00:35 +0000)]
vlan: use struct netdev_queue counters instead of dev->stats
We can update netdev_queue tx_bytes/tx_packets/tx_dropped counters instead
of dev->stats ones, to reduce number of cache lines dirtied in xmit path.
This fixes a performance problem on SMP when many different cpus take
vlan tx path.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 18 May 2009 00:34:33 +0000 (00:34 +0000)]
net: add tx_packets/tx_bytes/tx_dropped counters in struct netdev_queue
offsetof(struct net_device, features)=0x44
offsetof(struct net_device, stats.tx_packets)=0x54
offsetof(struct net_device, stats.tx_bytes)=0x5c
offsetof(struct net_device, stats.tx_dropped)=0x6c
Network drivers that touch dev->stats.tx_packets/stats.tx_bytes in their
tx path can slow down SMP operations, since they dirty a cache line
that should stay shared (dev->features is needed in rx and tx paths)
We could move the stats field elsewhere in net_device but it won't help that much.
(Two cache lines dirtied in tx path, we can do one only)
Better solution is to add tx_packets/tx_bytes/tx_dropped in struct
netdev_queue because this structure is already touched in tx path and
counters updates will then be free (no increase in size)
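A minimal sketch of per-queue accounting, with hypothetical names (txq_model,
tx_totals): each queue updates only its own counters in the tx path, and a
reader folds them into device totals when statistics are requested.

```c
#include <stddef.h>

/* Counters that live next to data the tx path already writes. */
struct txq_model {
	unsigned long tx_packets;
	unsigned long tx_bytes;
	unsigned long tx_dropped;
};

struct tx_totals {
	unsigned long tx_packets;
	unsigned long tx_bytes;
	unsigned long tx_dropped;
};

/* Called from the tx path: touches only this queue's counters. */
void txq_account(struct txq_model *q, unsigned long len)
{
	q->tx_packets++;
	q->tx_bytes += len;
}

/* Called when statistics are read: sum all queues into one view. */
struct tx_totals txq_sum(const struct txq_model *queues, size_t nr)
{
	struct tx_totals t = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < nr; i++) {
		t.tx_packets += queues[i].tx_packets;
		t.tx_bytes += queues[i].tx_bytes;
		t.tx_dropped += queues[i].tx_dropped;
	}
	return t;
}
```

The cost moves from the hot tx path to the rare statistics read, which is the
trade the text above argues for.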
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 18 May 2009 22:12:31 +0000 (15:12 -0700)]
sch_teql: should not dereference skb after ndo_start_xmit()
It is illegal to dereference a skb after a successful ndo_start_xmit()
call. We must store skb length in a local variable instead.
Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb
(net_sched: Add accessor function for packet length for qdiscs)
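The rule can be shown with a userspace model (hypothetical names; free() stands
in for the driver consuming the buffer): the length is copied to a local before
the transmit call, because the buffer may be gone by the time the call returns.

```c
#include <stdlib.h>

struct pkt_model { unsigned int len; };
struct stats_model { unsigned long tx_packets, tx_bytes; };

/* On success the driver owns the buffer and may free it immediately. */
int xmit_model(struct pkt_model *pkt)
{
	free(pkt);
	return 0;
}

int send_and_account(struct pkt_model *pkt, struct stats_model *st)
{
	unsigned int length = pkt->len;  /* read before ownership passes */
	int rc = xmit_model(pkt);

	if (rc == 0) {
		/* pkt must not be touched here; use the saved length. */
		st->tx_packets++;
		st->tx_bytes += length;
	}
	return rc;
}
```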
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter P Waskiewicz Jr [Sun, 17 May 2009 20:58:21 +0000 (20:58 +0000)]
ixgbe: Increase the driver version number
Marching along, let's bump the version number to indicate things actually
have happened to the driver.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter P Waskiewicz Jr [Sun, 17 May 2009 20:58:04 +0000 (20:58 +0000)]
ixgbe: Add generic XAUI support to 82599
This patch adds the generic XAUI device support for 82599 controllers.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Sun, 17 May 2009 20:57:47 +0000 (20:57 +0000)]
ixgbe: set max desc to prevent total RSC packet size of 64K
The performance of hardware RSC is greatly reduced if the total for max rsc
descriptors multiplied by the buffer size is greater than 65535. To
prevent this we need to adjust the max rsc descriptors appropriately.
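The constraint is simple arithmetic: descriptors times buffer size must stay at
or below 65535. The sketch below picks the largest descriptor count that
satisfies it; the candidate set {16, 8, 4, 1} and the function name are
assumptions made for illustration, not taken from the patch.

```c
#include <stddef.h>

#define RSC_MAX_TOTAL 65535u  /* limit stated in the text above */

/* Return the largest candidate count whose total stays within the limit. */
unsigned int rsc_max_desc(unsigned int rx_buf_len)
{
	static const unsigned int choices[] = { 16, 8, 4, 1 };
	size_t i;

	for (i = 0; i < sizeof(choices) / sizeof(choices[0]); i++) {
		/* Divide rather than multiply to avoid overflow. */
		if (rx_buf_len <= RSC_MAX_TOTAL / choices[i])
			return choices[i];
	}
	return 1;
}
```

Note that a 4096 byte buffer cannot use 16 descriptors, since 16 * 4096 is
65536, one byte over the limit.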
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Sun, 10 May 2009 20:32:34 +0000 (20:32 +0000)]
tcp: fix MSG_PEEK race check
Commit 518a09ef11 (tcp: Fix recvmsg MSG_PEEK influence of
blocking behavior) lets the loop run longer than the race check
previously expected, so we need to be more careful with this
check and consider the work we have been doing.
I tried my best to deal with the urg hole madness too, which happens
here:
	if (!sock_flag(sk, SOCK_URGINLINE)) {
		++*seq;
		...
by using an additional offset of one, but I certainly have very
little interest in testing that part.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Frans Pop <elendil@planet.nl>
Tested-by: Ian Zimmermann <itz@buug.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 18 May 2009 21:48:30 +0000 (14:48 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next-2.6
Wang Tinggong [Thu, 14 May 2009 22:49:36 +0000 (22:49 +0000)]
Doc: fixed descriptions on /proc/sys/net/core/* and /proc/sys/net/unix/*
Signed-off-by: Wang Tinggong <wangtinggong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
roel kluin [Fri, 15 May 2009 10:19:51 +0000 (10:19 +0000)]
Neterion: *FIFO1_DMA_ERR set twice, should 2nd be *FIFO2_DMA_ERR?
FIFO1_DMA_ERR is set twice, the second should be FIFO2_DMA_ERR.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Ram Vepa <ram.vepa@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gabriel Paubert [Mon, 18 May 2009 04:16:47 +0000 (21:16 -0700)]
mv643xx_eth: fix PPC DMA breakage
After 2.6.29, PPC no longer accepts NULL as the dev parameter of
the DMA API. The result is a BUG followed by a solid lock-up when the
mv643xx_eth driver brings an interface up. The following patch makes
the driver work on my Pegasos again; it is mostly a search and replace
of NULL by mp->dev->dev.parent in dma allocation/freeing/mapping/unmapping
functions.
Signed-off-by: Gabriel Paubert <paubert@iram.es>
Acked-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Fri, 15 May 2009 08:44:32 +0000 (08:44 +0000)]
bonding: fix link down handling in 802.3ad mode
One of the purposes of bonding is to allow for redundant links, and failover
correctly if the cable is pulled. If all the members of a bonded device have
no carrier present, the bonded device itself needs to report no carrier present
to user space so management tools (like routing daemons) can respond.
Bonding in 802.3ad mode does not work correctly for this because it incorrectly
chooses a link that is down as a possible aggregator.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 18 May 2009 04:14:33 +0000 (21:14 -0700)]
Merge branch 'linux-2.6.30.y' of git://git./linux/kernel/git/inaky/wimax
Stephen Hemminger [Fri, 15 May 2009 06:11:58 +0000 (06:11 +0000)]
bridge: fix initial packet flood if !STP
If bridge is configured with no STP and forwarding delay of 0 (which
is typical for virtualization) then when link starts it will flood all
packets for the first 20 seconds.
This bug was introduced by a combination of earlier changes:
* forwarding database uses hold time of zero to indicate
user wants to always flood packets
* optimization of the case of forwarding delay of 0 avoids the initial
timer tick
The fix is to just skip all the topology change detection code if
kernel STP is not being used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Fri, 15 May 2009 06:10:13 +0000 (06:10 +0000)]
bridge: relay bridge multicast pkgs if !STP
Currently the bridge catches all STP packets, even if STP is turned
off. This prevents other systems (which do have STP turned on)
from being able to detect loops in the network.
With this patch, if STP is off, then any packet sent to the STP
multicast group address is forwarded to all ports.
Based on earlier patch by Joakim Tjernlund with changes
to go through forwarding (not local chain), and optimization
that only last octet needs to be checked.
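A sketch of the address test, with hypothetical names. It relies on the fact
that the bridge group address used by STP is 01:80:C2:00:00:00 and that the
reserved link-local block shares its first five octets, so once an address is
known to be in that block only the last octet has to be compared.

```c
#include <stdbool.h>
#include <string.h>

/* First five octets shared by 01:80:C2:00:00:00 .. 01:80:C2:00:00:0F. */
static const unsigned char link_local_prefix[5] = {
	0x01, 0x80, 0xc2, 0x00, 0x00
};

bool is_link_local_model(const unsigned char *dest)
{
	return memcmp(dest, link_local_prefix, 5) == 0 &&
	       (dest[5] & 0xf0) == 0;
}

enum verdict_model { VERDICT_NORMAL, VERDICT_LOCAL, VERDICT_FORWARD };

/* Decide what to do with a frame based on its destination address. */
enum verdict_model classify_model(const unsigned char *dest, bool stp_enabled)
{
	if (!is_link_local_model(dest))
		return VERDICT_NORMAL;
	/* Prefix already matched: the STP group is the one ending in 00. */
	if (!stp_enabled && dest[5] == 0)
		return VERDICT_FORWARD;
	return VERDICT_LOCAL;
}
```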
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ralf Baechle [Sat, 16 May 2009 01:21:58 +0000 (01:21 +0000)]
NET: Meth: Fix unsafe mix of irq and non-irq spinlocks.
Mixing of normal and irq spinlocks results in the following lockdep messages
on bootup on IP32:
[...]
Sending DHCP requests .
======================================================
[ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
2.6.30-rc5-00164-g41baeef #30
------------------------------------------------------
swapper/1 [HC0[0]:SC0[1]:HE0:SE0] is trying to acquire:
 (&priv->meth_lock){+.+...}, at: [<ffffffff8026388c>] meth_tx+0x48/0x43c
and this task is already holding:
 (_xmit_ETHER#2){+.-...}, at: [<ffffffff802d3a00>] __qdisc_run+0x118/0x30c
which would create a new lock dependency:
 (_xmit_ETHER#2){+.-...} -> (&priv->meth_lock){+.+...}
but this new dependency connects a SOFTIRQ-irq-safe lock:
 (_xmit_ETHER#2){+.-...}
... which became SOFTIRQ-irq-safe at:
  [<ffffffff80061458>] __lock_acquire+0x784/0x1a14
  [<ffffffff800627e0>] lock_acquire+0xf8/0x150
  [<ffffffff800128d0>] _spin_lock+0x30/0x44
  [<ffffffff802d2b88>] dev_watchdog+0x70/0x398
  [<ffffffff800433b8>] run_timer_softirq+0x1a8/0x248
  [<ffffffff8003da5c>] __do_softirq+0xec/0x208
  [<ffffffff8003dbd8>] do_softirq+0x60/0xe4
  [<ffffffff8003dda0>] irq_exit+0x54/0x9c
  [<ffffffff80004420>] ret_from_irq+0x0/0x4
  [<ffffffff80004720>] r4k_wait+0x20/0x40
  [<ffffffff80015418>] cpu_idle+0x30/0x60
  [<ffffffff804cd934>] start_kernel+0x3ec/0x404
to a SOFTIRQ-irq-unsafe lock:
 (&priv->meth_lock){+.+...}
... which became SOFTIRQ-irq-unsafe at:
... [<ffffffff800614f8>] __lock_acquire+0x824/0x1a14
  [<ffffffff800627e0>] lock_acquire+0xf8/0x150
  [<ffffffff800128d0>] _spin_lock+0x30/0x44
  [<ffffffff80263f20>] meth_reset+0x118/0x2d8
  [<ffffffff8026424c>] meth_open+0x28/0x140
  [<ffffffff802c1ae8>] dev_open+0xe0/0x18c
  [<ffffffff802c1268>] dev_change_flags+0xd8/0x1d4
  [<ffffffff804e7770>] ip_auto_config+0x1d4/0xf28
  [<ffffffff80012e68>] do_one_initcall+0x58/0x170
  [<ffffffff804cd190>] kernel_init+0x98/0x104
  [<ffffffff8001520c>] kernel_thread_helper+0x10/0x18
other info that might help us debug this:
2 locks held by swapper/1:
 #0: (rcu_read_lock){.+.+..}, at: [<ffffffff802c0954>] dev_queue_xmit+0x1e0/0x4b0
 #1: (_xmit_ETHER#2){+.-...}, at: [<ffffffff802d3a00>] __qdisc_run+0x118/0x30c
the SOFTIRQ-irq-safe lock's dependencies:
-> (_xmit_ETHER#2){+.-...} ops: 0 {
   HARDIRQ-ON-W at:
     [<ffffffff800614d0>] __lock_acquire+0x7fc/0x1a14
     [<ffffffff800627e0>] lock_acquire+0xf8/0x150
     [<ffffffff800128d0>] _spin_lock+0x30/0x44
     [<ffffffff802d2b88>] dev_watchdog+0x70/0x398
     [<ffffffff800433b8>] run_timer_softirq+0x1a8/0x248
     [<ffffffff8003da5c>] __do_softirq+0xec/0x208
     [<ffffffff8003dbd8>] do_softirq+0x60/0xe4
     [<ffffffff8003dda0>] irq_exit+0x54/0x9c
     [<ffffffff80004420>] ret_from_irq+0x0/0x4
     [<ffffffff80004720>] r4k_wait+0x20/0x40
     [<ffffffff80015418>] cpu_idle+0x30/0x60
     [<ffffffff804cd934>] start_kernel+0x3ec/0x404
   IN-SOFTIRQ-W at:
     [<ffffffff80061458>] __lock_acquire+0x784/0x1a14
     [<ffffffff800627e0>] lock_acquire+0xf8/0x150
     [<ffffffff800128d0>] _spin_lock+0x30/0x44
     [<ffffffff802d2b88>] dev_watchdog+0x70/0x398
     [<ffffffff800433b8>] run_timer_softirq+0x1a8/0x248
     [<ffffffff8003da5c>] __do_softirq+0xec/0x208
     [<ffffffff8003dbd8>] do_softirq+0x60/0xe4
     [<ffffffff8003dda0>] irq_exit+0x54/0x9c
     [<ffffffff80004420>] ret_from_irq+0x0/0x4
     [<ffffffff80004720>] r4k_wait+0x20/0x40
     [<ffffffff80015418>] cpu_idle+0x30/0x60
     [<ffffffff804cd934>] start_kernel+0x3ec/0x404
   INITIAL USE at:
     [<ffffffff80061570>] __lock_acquire+0x89c/0x1a14
     [<ffffffff800627e0>] lock_acquire+0xf8/0x150
     [<ffffffff800128d0>] _spin_lock+0x30/0x44
     [<ffffffff802d2b88>] dev_watchdog+0x70/0x398
     [<ffffffff800433b8>] run_timer_softirq+0x1a8/0x248
     [<ffffffff8003da5c>] __do_softirq+0xec/0x208
     [<ffffffff8003dbd8>] do_softirq+0x60/0xe4
     [<ffffffff8003dda0>] irq_exit+0x54/0x9c
     [<ffffffff80004420>] ret_from_irq+0x0/0x4
     [<ffffffff80004720>] r4k_wait+0x20/0x40
     [<ffffffff80015418>] cpu_idle+0x30/0x60
     [<ffffffff804cd934>] start_kernel+0x3ec/0x404
 }
 ... key at: [<ffffffff80cf93f0>] netdev_xmit_lock_key+0x8/0x1c8
the SOFTIRQ-irq-unsafe lock's dependencies:
-> (&priv->meth_lock){+.+...} ops: 0 {
   HARDIRQ-ON-W at:
     [<ffffffff800614d0>] __lock_acquire+0x7fc/0x1a14
     [<ffffffff800627e0>] lock_acquire+0xf8/0x150
     [<ffffffff800128d0>] _spin_lock+0x30/0x44
     [<ffffffff80263f20>] meth_reset+0x118/0x2d8
     [<ffffffff8026424c>] meth_open+0x28/0x140
     [<ffffffff802c1ae8>] dev_open+0xe0/0x18c
     [<ffffffff802c1268>] dev_change_flags+0xd8/0x1d4
     [<ffffffff804e7770>] ip_auto_config+0x1d4/0xf28
     [<ffffffff80012e68>] do_one_initcall+0x58/0x170
     [<ffffffff804cd190>] kernel_init+0x98/0x104
     [<ffffffff8001520c>] kernel_thread_helper+0x10/0x18
   SOFTIRQ-ON-W at:
     [<ffffffff800614f8>] __lock_acquire+0x824/0x1a14
     [<ffffffff800627e0>] lock_acquire+0xf8/0x150
     [<ffffffff800128d0>] _spin_lock+0x30/0x44
     [<ffffffff80263f20>] meth_reset+0x118/0x2d8
     [<ffffffff8026424c>] meth_open+0x28/0x140
     [<ffffffff802c1ae8>] dev_open+0xe0/0x18c
     [<ffffffff802c1268>] dev_change_flags+0xd8/0x1d4
     [<ffffffff804e7770>] ip_auto_config+0x1d4/0xf28
     [<ffffffff80012e68>] do_one_initcall+0x58/0x170
     [<ffffffff804cd190>] kernel_init+0x98/0x104
     [<ffffffff8001520c>] kernel_thread_helper+0x10/0x18
   INITIAL USE at:
     [<ffffffff80061570>] __lock_acquire+0x89c/0x1a14
     [<ffffffff800627e0>] lock_acquire+0xf8/0x150
     [<ffffffff800128d0>] _spin_lock+0x30/0x44
     [<ffffffff80263f20>] meth_reset+0x118/0x2d8
     [<ffffffff8026424c>] meth_open+0x28/0x140
     [<ffffffff802c1ae8>] dev_open+0xe0/0x18c
     [<ffffffff802c1268>] dev_change_flags+0xd8/0x1d4
     [<ffffffff804e7770>] ip_auto_config+0x1d4/0xf28
     [<ffffffff80012e68>] do_one_initcall+0x58/0x170
     [<ffffffff804cd190>] kernel_init+0x98/0x104
     [<ffffffff8001520c>] kernel_thread_helper+0x10/0x18
 }
 ... key at: [<ffffffff80cf6ce8>] __key.32424+0x0/0x8
stack backtrace:
Call Trace:
[<ffffffff8000ed0c>] dump_stack+0x8/0x34
[<ffffffff80060b74>] check_usage+0x470/0x4a0
[<ffffffff80060c34>] check_irq_usage+0x90/0x130
[<ffffffff80061f78>] __lock_acquire+0x12a4/0x1a14
[<ffffffff800627e0>] lock_acquire+0xf8/0x150
[<ffffffff80012a0c>] _spin_lock_irqsave+0x60/0x84
[<ffffffff8026388c>] meth_tx+0x48/0x43c
[<ffffffff802d3a38>] __qdisc_run+0x150/0x30c
[<ffffffff802c0aa8>] dev_queue_xmit+0x334/0x4b0
[<ffffffff804e7e6c>] ip_auto_config+0x8d0/0xf28
[<ffffffff80012e68>] do_one_initcall+0x58/0x170
[<ffffffff804cd190>] kernel_init+0x98/0x104
[<ffffffff8001520c>] kernel_thread_helper+0x10/0x18
..... timed out!
IP-Config: Retrying forever (NFS root)...
Sending DHCP requests ., OK
[...]
Fixed by converting all locks to irq locks.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Tested-by: Andrew Randrianasulu <randrik_a@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter P Waskiewicz Jr [Sun, 17 May 2009 12:35:57 +0000 (12:35 +0000)]
ixgbe: Don't reset the hardware when switching between LFC and PFC
When running in DCB mode, switching between link flow control and priority
flow control shouldn't need to reset the hardware. This removes that
reset.
This also extends the set_all() dcbnl callback to return a value indicating
that the HW config changed but a reset was not required.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter P Waskiewicz Jr [Sun, 17 May 2009 12:35:36 +0000 (12:35 +0000)]
ixgbe: When in DCB mode with PFC enabled, show LFC is disabled
Ethtool should report that link flow control is disabled when in priority
flow control mode.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter P Waskiewicz Jr [Sun, 17 May 2009 12:35:16 +0000 (12:35 +0000)]
ixgbe: Allow link flow control in DCB mode for 82599 adapters
82599 supports using either link flow control or priority flow control when
in DCB mode. The dcbnl interface already supports sending down
configurations through rtnetlink that can enable LFC when DCB is enabled,
so the driver should take advantage of this.
82598 does not support using LFC when DCB is enabled, so explicitly disable
it when we're in DCB mode. This means we always run in PFC mode when DCB
is enabled.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter P Waskiewicz Jr [Sun, 17 May 2009 12:34:55 +0000 (12:34 +0000)]
ixgbe: Set Priority Flow Control low water threshold for DCB
This sets the low water threshold for priority flow control for 82598
and 82599 controllers in DCB mode.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Sun, 17 May 2009 12:34:35 +0000 (12:34 +0000)]
ixgbe: Enable jumbo frame for FCoE feature in 82599
Enable jumbo frame when FCoE feature is enabled in 82599. Use 3K
as the buffer size for receive queues used by FCoE, to
accommodate the maximum Fiber Channel frame size of 2148 bytes (with
max 2112 bytes of payload).
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Sun, 17 May 2009 12:34:14 +0000 (12:34 +0000)]
ixgbe: Enable FCoE redirection table feature in 82599
Enable using FCoE redirection table feature in 82599. The FCoE
redirection table has maximum of eight entries, corresponding
to maximum of eight receive queues to be used for distributing
incoming FCoE packets. This patch sets up the FCoE redirection
table when multiple receive queues are available for FCoE.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Sun, 17 May 2009 12:33:52 +0000 (12:33 +0000)]
ixgbe: Add RING_F_FCOE for FCoE feature in 82599
Add ring feature for FCoE to make use of the FCoE redirection
table in 82599. The FCoE redirection table is a receive side
scaling feature for Fiber Channel over Ethernet feature in 82599,
enabling distributing FCoE packets to different receive queues
based on the exchange id.
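A sketch of how an eight-entry redirection table could spread exchanges over
the available receive queues. The names are hypothetical, and selecting the
entry with the low bits of the exchange id is an assumption made for
illustration, not a statement about the hardware.

```c
#define FCRETA_ENTRIES 8  /* eight-entry FCoE redirection table */

/* Fill the table round-robin over nr_queues queues starting at
 * first_queue, so every entry points at a valid FCoE receive queue. */
void fcreta_fill(unsigned int table[FCRETA_ENTRIES],
		 unsigned int first_queue, unsigned int nr_queues)
{
	unsigned int i;

	if (nr_queues == 0)
		nr_queues = 1;  /* fall back to a single queue */
	for (i = 0; i < FCRETA_ENTRIES; i++)
		table[i] = first_queue + i % nr_queues;
}

/* Assumption: the low bits of the exchange id select the table entry. */
unsigned int fcreta_lookup(const unsigned int table[FCRETA_ENTRIES],
			   unsigned int exchange_id)
{
	return table[exchange_id % FCRETA_ENTRIES];
}
```

Frames of one exchange always map to the same entry, so they stay on one
queue, while different exchanges can land on different queues.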
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasu Dev [Sun, 17 May 2009 12:33:28 +0000 (12:33 +0000)]
fcoe: adds spma mode support
If a MAC address of type NETDEV_HW_ADDR_T_SAN is found on the
netdev backing a fcoe interface, set the fc->ctlr.spma flag and
store the spma mode address in ctl_src_addr.
When the spma flag is set:
1. Add the spma mode MAC address in ctl_src_addr as a secondary
MAC address; the FLOGI for FIP and pre-FIP will go out
using this address.
2. Clean up the stored spma MAC address in ctl_src_addr in
fcoe_netdev_cleanup.
3. Set the spma bit in fip_flags for FIP solicitations along
with the existing FPMA bit setting.
4. Initialize the FLOGI FIP MAC descriptor to the stored spma
MAC address in ctl_src_addr. This is used as the proposed
FCoE MAC address from the initiator, with both the SPMA
and FPMA bits set in the FIP solicitation; in response the
switch may grant any FPMA or SPMA mode MAC address to the
initiator.
Removes the FIP descriptor type check against ELS type
ELS_FLOGI in fcoe_ctlr_encaps when updating a FIP MAC descriptor;
it now checks against FIP_DT_FLOGI instead.
I've tested this with an available FPMA-only FCoE switch. Since
data_src_addr is updated by the same old code for both FPMA and
SPMA modes, with FIP or pre-FIP links, the added SPMA mode will
also work with an SPMA-only switch, provided that the switch
grants a valid MAC address.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasu Dev [Sun, 17 May 2009 12:33:08 +0000 (12:33 +0000)]
fcoe: consolidates netdev related config and cleanup for spma mode
Currently fcoe_netdev_config adds the netdev pkt handler for fcoe pkts,
fcoe_if_create adds the netdev pkt handler for fip packets, and a secondary
MAC address is added by fcoe_netdev_config. Cleanup for these netdev
related additions is done only in fcoe_if_destroy; nothing is cleaned up
when fcoe interface creation fails in fcoe_if_create after the netdev
config above has been done.
So this patch adds a single function for that cleanup,
fcoe_netdev_cleanup, and calls it either on fcoe interface
destroy or when exiting fcoe_if_create due to an error after the
fcoe/fip netdev config is done.
Moved the code block adding the netdev pkt handler for fip pkts next to
the similar block for fcoe pkts in fcoe_netdev_config, so that
fcoe_netdev_cleanup can be called on error from fcoe_netdev_config
to undo both the fcoe and fip pkt handler additions. This move
required a reference to fcoe_fip_recv in fcoe_netdev_config, so the
fip related functions fcoe_fip_recv, fcoe_fip_send and
fcoe_update_src_mac were moved above fcoe_netdev_config.
This consolidation will let the spma mode support in the next patch
easily add or delete the spma mode mac address, besides fixing the current
lack of cleanup on error.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
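The consolidation described in this commit can be sketched in user-space C as follows. This is only an illustration of the pattern (one cleanup function shared by the destroy path and the error path), not the fcoe driver code: `struct port_sketch` and every `*_sketch` function are hypothetical stand-ins, and the `fail_*` parameters exist only to simulate errors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-interface state; not the real fcoe structures. */
struct port_sketch {
    bool fcoe_handler_added;   /* packet handler for FCoE frames */
    bool fip_handler_added;    /* packet handler for FIP frames */
    bool secondary_mac_added;  /* secondary MAC address on the netdev */
};

/* All netdev related additions live in one function ... */
int netdev_config_sketch(struct port_sketch *p, bool fail_midway)
{
    assert(p != NULL);
    p->fcoe_handler_added = true;
    p->fip_handler_added = true;
    if (fail_midway)
        return -1;             /* simulated failure after partial setup */
    p->secondary_mac_added = true;
    return 0;
}

/* ... so a single function can undo whatever was added. */
void netdev_cleanup_sketch(struct port_sketch *p)
{
    assert(p != NULL);
    p->secondary_mac_added = false;
    p->fip_handler_added = false;
    p->fcoe_handler_added = false;
}

/* Create path: any error after the netdev config runs the shared cleanup. */
int if_create_sketch(struct port_sketch *p, bool fail_config, bool fail_later)
{
    int rc = netdev_config_sketch(p, fail_config);

    if (rc)
        goto out_cleanup;
    if (fail_later) {
        rc = -1;               /* simulated failure later in creation */
        goto out_cleanup;
    }
    return 0;

out_cleanup:
    netdev_cleanup_sketch(p);
    return rc;
}

/* Destroy path: the same cleanup function is reused. */
void if_destroy_sketch(struct port_sketch *p)
{
    netdev_cleanup_sketch(p);
}
```

With this shape, a failed create leaves no handlers or addresses behind, which is the "no cleanup on error" issue the commit message describes.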
Waskiewicz Jr, Peter P [Sun, 17 May 2009 12:32:48 +0000 (12:32 +0000)]
ixgbe: Add SAN MAC address to the RAR, return the address to DCB
After acquiring the SAN MAC address from the EEPROM, we need to program it
into one of the RARs. Also, DCB will use this MAC address to run DCBX
commands, so it doesn't have to play musical MAC addresses when things like
bonding enter the picture. So we need to return the MAC address through
the netlink interface to userspace.
This also moves the init_rx_addrs() call out of start_hw() and into
reset_hw(). We shouldn't try to read any of the RAR information before
initializing our internal accounting of the RAR table, which was what
was happening.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PJ Waskiewicz [Sun, 17 May 2009 12:32:25 +0000 (12:32 +0000)]
ixgbe: Add FCoE Storage MAC Address support
This patch implements the Storage Address entrypoint from the net device.
It will read the SAN MAC addresses from the EEPROM of the 82599 hardware,
and make them available to the FCoE stack through the net device.
Also, add/del the SAN MAC address to the netdev dev_addr_list via the
kernel api dev_addr_add()/dev_addr_del() when there is a valid SAN MAC
supported by the HW.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Beregalov [Fri, 15 May 2009 10:22:42 +0000 (10:22 +0000)]
skfddi: convert PRINTK() to pr_debug()
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Beregalov [Fri, 15 May 2009 10:22:41 +0000 (10:22 +0000)]
mac89x0: remove PRINTK()
There are no users of it.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Beregalov [Fri, 15 May 2009 10:22:40 +0000 (10:22 +0000)]
de620: convert PRINTK() to pr_debug() and cleanup
Also remove DE620_DEBUG and de620_debug.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Beregalov [Fri, 15 May 2009 10:22:39 +0000 (10:22 +0000)]
de600: convert PRINTK() to pr_debug()
Also remove de600_debug as it is not needed.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Beregalov [Fri, 15 May 2009 10:22:38 +0000 (10:22 +0000)]
de620: fix forgotten semicolon
It seems it always was here.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Fri, 15 May 2009 07:59:42 +0000 (07:59 +0000)]
drivers/net: Convert #ifdef DEBUG printk(KERN_DEBUG to pr_debug(
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 15 May 2009 06:06:16 +0000 (06:06 +0000)]
sfc: Use generic XENPAK register definitions
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 15 May 2009 06:05:49 +0000 (06:05 +0000)]
mdio: Add XENPAK LASI register definitions
These registers were originally defined for XENPAK modules, but are
also implemented by many other 10G PHYs.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 15 May 2009 06:04:12 +0000 (06:04 +0000)]
mdio: Add 10GBASE-T SNR register definition
These do not have an in-kernel user but may be useful to user-space.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Korsgaard [Wed, 13 May 2009 10:10:41 +0000 (10:10 +0000)]
smsc95xx: strip ethernet fcs (crc) on receive path
The smsc95xx driver was forwarding the trailing fcs on received frames
up the stack leading to confusion in tcpdump.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Tested-by: Steve Glendinning <steve.glendinning@smsc.com>
Acked-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
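As a rough illustration of what stripping the FCS means, the sketch below trims the 4-byte Ethernet frame check sequence from a received frame's length before the frame is passed on. It is not the smsc95xx code: `struct rx_frame_sketch`, `strip_fcs_sketch()` and `FCS_LEN_SKETCH` are hypothetical names, and the choice to leave too-short frames untouched is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>

#define FCS_LEN_SKETCH 4   /* the Ethernet frame check sequence is 4 bytes */

/* Hypothetical received frame: a buffer plus its length in bytes. */
struct rx_frame_sketch {
    const unsigned char *data;
    size_t len;
};

/* Drop the trailing FCS so upper layers (and packet capture) do not see it. */
void strip_fcs_sketch(struct rx_frame_sketch *frame)
{
    assert(frame != NULL);
    if (frame->len >= FCS_LEN_SKETCH)
        frame->len -= FCS_LEN_SKETCH;
}
```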
Peter Korsgaard [Wed, 13 May 2009 10:09:53 +0000 (10:09 +0000)]
dm9601: trivial comment fixes
The comments describing the rx/tx headers used a combination of zero-
and 1-based indexing, leading to confusion.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 18 May 2009 03:55:16 +0000 (20:55 -0700)]
net: tx scalability works : trans_start
struct net_device trans_start field is a hot spot on SMP and high performance
devices, particularly multi queue ones, because every transmitter dirties
it. Its main use is tx watchdog and bonding alive checks.
But as most devices don't use NETIF_F_LLTX, we have to lock
a netdev_queue before calling their ndo_start_xmit(). So it makes
sense to move trans_start from net_device to netdev_queue. Its update
will occur on an already present (and in exclusive state) cache line, for
free.
We can do this transition smoothly. An old driver continues to
update dev->trans_start, while an updated one updates txq->trans_start.
Further patches could also put tx_bytes/tx_packets counters in
netdev_queue to avoid dirtying dev->stats (vlan device comes to mind)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
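The transition described in this commit can be pictured with a small user-space sketch. The structures below are hypothetical simplifications, not the kernel's net_device or netdev_queue, and the reader function is an assumption about how a watchdog-style consumer could combine the legacy field with the per-queue ones. Plain integer comparison is used for brevity; real jiffies comparisons must handle wraparound.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TX_QUEUES_SKETCH 4

/* Hypothetical per-queue state: each transmitter dirties only its own queue. */
struct txq_sketch {
    unsigned long trans_start;   /* time of the last transmit on this queue */
};

struct dev_sketch {
    unsigned long trans_start;   /* legacy field, still written by old drivers */
    struct txq_sketch txq[NUM_TX_QUEUES_SKETCH];
};

/* Updated driver: touch only the queue it already holds locked. */
void txq_trans_update_sketch(struct dev_sketch *dev, size_t q, unsigned long now)
{
    assert(dev != NULL && q < NUM_TX_QUEUES_SKETCH);
    dev->txq[q].trans_start = now;
}

/* Reader: most recent transmit across the legacy field and all queues. */
unsigned long dev_last_trans_start_sketch(const struct dev_sketch *dev)
{
    unsigned long latest;
    size_t i;

    assert(dev != NULL);
    latest = dev->trans_start;
    for (i = 0; i < NUM_TX_QUEUES_SKETCH; i++) {
        if (dev->txq[i].trans_start > latest)
            latest = dev->txq[i].trans_start;
    }
    return latest;
}
```

Because the reader considers both places, old and updated drivers can coexist during the transition.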
Tilman Schmidt [Wed, 13 May 2009 12:44:18 +0000 (12:44 +0000)]
gigaset: remove unused structure member rcvbytes
The B channel data structure member rcvbytes was never set to
anything else but zero, so drop it.
Impact: cleanup
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 13 May 2009 12:44:18 +0000 (12:44 +0000)]
gigaset: remove UNDOCREQ config option
Drop the kernel config option GIGASET_UNDOCREQ, permanently
activating the code it controlled, as there have been no reports
of problems caused by its activation but many problems caused by
it being disabled.
Also fix a few bad comments while we're at it.
Impact: cleanup
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 13 May 2009 12:44:18 +0000 (12:44 +0000)]
gigaset: move up Kconfig inclusion point
In preparation for porting to kernel CAPI subsystem, include the
Gigaset driver's Kconfig directly from ISDN's instead of I4L's.
Impact: Kconfig reorganisation, no functional change
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 13 May 2009 12:44:17 +0000 (12:44 +0000)]
gigaset: documentation update
Mention handling of unregistered DECT wireless datasets in README.gigaset.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 13 May 2009 12:44:17 +0000 (12:44 +0000)]
gigaset: fix error return code
gigaset_register_to_LL() is expected to print a message and return 0
on failure. Make it do so consistently.
Impact: error handling bugfix
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 13 May 2009 12:44:17 +0000 (12:44 +0000)]
gigaset: skip unnecessary hex formatting
Don't generate the hex representation of the payload data if it
isn't actually used afterwards.
Impact: optimization
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
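A minimal sketch of this kind of optimization, assuming the hex string is only consumed when a debug flag is set (the commit message does not say what the actual condition is). `dump_payload_sketch()` and its parameters are hypothetical and unrelated to the gigaset source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format `len` bytes of `data` as hex into `out`, but only when the result
 * will actually be used. Returns the number of characters written.
 */
size_t dump_payload_sketch(int debug_enabled, const unsigned char *data,
                           size_t len, char *out, size_t out_size)
{
    size_t used = 0;
    size_t i;

    assert(data != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    if (!debug_enabled)
        return 0;                /* nobody will read it: skip the work */

    /* Each byte needs two hex digits plus room for the terminating NUL. */
    for (i = 0; i < len && used + 3 <= out_size; i++) {
        snprintf(out + used, out_size - used, "%02x", data[i]);
        used += 2;
    }
    return used;
}
```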
Tilman Schmidt [Wed, 13 May 2009 12:44:17 +0000 (12:44 +0000)]
gigaset: fix possible oops in error handling
Use pr_warning() / pr_err() instead of dev_warn() / dev_err() in two
places where the dev pointer isn't guaranteed to be valid.
Impact: error handling bugfix
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 13 May 2009 12:44:17 +0000 (12:44 +0000)]
gigaset: remove obsolete references to m10x state table
The separation of state tables for base and M10x has long been
removed. Clean up remaining traces of it.
Impact: cleanup
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yevgeny Petrilin [Mon, 18 May 2009 03:48:59 +0000 (20:48 -0700)]
mlx4_en: Fix not deleted napi structures
Napi structures are being created each time we open a port, but when
the port is closed the napi structure is only disabled but not removed.
This bug caused hang while removing the driver.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 12 May 2009 20:48:02 +0000 (20:48 +0000)]
bnx2: bnx2_tx_int() optimizations
When using bnx2 in a high transmit load, bnx2_tx_int() cost is pretty high.
There are two reasons.
One is an expensive call to bnx2_get_hw_tx_cons(bnapi) for each freed skb
One is cpu stalls when accessing skb_is_gso(skb) / skb_shinfo(skb)->nr_frags
because of two cache line misses.
(One to get skb->end/head to compute skb_shinfo(skb),
one to get is_gso/nr_frags)
This patch :
1) avoids calling bnx2_get_hw_tx_cons(bnapi) too many times.
2) makes bnx2_start_xmit() cache is_gso & nr_frags into sw_tx_bd descriptor.
This uses a little bit more ram (256 longs per device on x86), but helps a lot.
3) uses a prefetch(&skb->end) to speedup dev_kfree_skb(), bringing
cache line that will be needed in skb_release_data()
result is 5 % bandwidth increase in benchmarks, involving UDP or TCP receive
& transmits, when a cpu is dedicated to ksoftirqd for bnx2.
bnx2_tx_int going from 3.33 % cpu to 0.5 % cpu in oprofile
Note : skb_dma_unmap() still very expensive but this is for another patch,
not related to bnx2 (2.9 % of cpu, while it does nothing on x86_32)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
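Point 2) above, caching is_gso and nr_frags in the software descriptor, can be sketched as follows. The structures are hypothetical simplifications rather than the bnx2 or sk_buff definitions, and counting one slot for the linear part plus one per fragment is an assumption made for the example. The idea shown is that the values are copied while the packet is still hot in cache at transmit time, so the completion path reads only the descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified packet; in the kernel these values sit in a
 * separate shared-info area that costs extra cache misses to reach. */
struct skb_sketch {
    unsigned short gso_size;   /* non-zero means the packet is GSO */
    unsigned short nr_frags;   /* number of page fragments */
};

/* Hypothetical software tx descriptor with the two cached values. */
struct sw_tx_bd_sketch {
    struct skb_sketch *skb;
    unsigned short is_gso;
    unsigned short nr_frags;
};

/* Transmit path: the packet is hot in cache, so copying is cheap here. */
void tx_bd_fill_sketch(struct sw_tx_bd_sketch *bd, struct skb_sketch *skb)
{
    assert(bd != NULL && skb != NULL);
    bd->skb = skb;
    bd->is_gso = skb->gso_size != 0;
    bd->nr_frags = skb->nr_frags;
}

/* Completion path: uses only the descriptor, never dereferences bd->skb. */
unsigned int tx_bd_slots_sketch(const struct sw_tx_bd_sketch *bd)
{
    assert(bd != NULL);
    return 1u + bd->nr_frags;  /* linear part plus one slot per fragment */
}
```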
John Dykstra [Tue, 12 May 2009 15:34:50 +0000 (15:34 +0000)]
tcp: tcp_prequeue() can use keyed wakeups
When TCP frees up write buffer space, avoid waking up tasks that have
done a poll() or select() on the same socket specifying read-side
events.
This is an extension of a read-side patch by Eric Dumazet.
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Friesen [Mon, 18 May 2009 03:39:33 +0000 (20:39 -0700)]
ipconfig: handle case of delayed DHCP server
If a DHCP server is delayed, it's possible for the client to receive the
DHCPOFFER after it has already sent out a new DHCPDISCOVER message from
a second interface. The client then sends out a DHCPREQUEST from the
second interface, but the server doesn't recognize the device and
rejects the request.
This patch simply tracks the current device being configured and throws
away the OFFER if it is not intended for the current device. A more
sophisticated approach would be to put the OFFER information into the
struct ic_device rather than storing it globally.
Signed-off-by: Chris Friesen <cfriesen@nortel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
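The check described above can be sketched like this. It is a user-space illustration only: `struct ic_device_sketch`, `ic_current_dev_sketch` and `ic_accept_offer_sketch()` are hypothetical names, and comparing device pointers is an assumption about how "intended for the current device" could be decided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-interface record; not the kernel's struct ic_device. */
struct ic_device_sketch {
    int ifindex;
};

/* Device whose DHCPDISCOVER is currently outstanding; NULL when none. */
const struct ic_device_sketch *ic_current_dev_sketch;

/* Accept an OFFER only if it arrived for the device being configured now. */
bool ic_accept_offer_sketch(const struct ic_device_sketch *rx_dev)
{
    assert(rx_dev != NULL);
    return ic_current_dev_sketch != NULL && rx_dev == ic_current_dev_sketch;
}
```

A late OFFER for the first interface is then dropped once configuration has moved on to the second one, instead of triggering a DHCPREQUEST from the wrong device.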
Pavel Emelyanov [Mon, 11 May 2009 00:36:35 +0000 (00:36 +0000)]
netpoll: don't dereference NULL dev from np
It looks like the dev in netpoll_poll can be NULL - at least it's
checked at the function beginning. Thus the dev->netdev_ops dereference
looks dangerous.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
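The hazard can be illustrated with a small sketch in which the device pointer is validated before any of its members are read. All types and functions here are hypothetical simplifications and do not reproduce the netpoll source.

```c
#include <assert.h>
#include <stddef.h>

struct net_device_sketch;

/* Hypothetical ops table with a single poll hook. */
struct netdev_ops_sketch {
    void (*ndo_poll_controller)(struct net_device_sketch *dev);
};

struct net_device_sketch {
    const struct netdev_ops_sketch *netdev_ops;
    int polls;                        /* counts hook invocations */
};

struct netpoll_sketch {
    struct net_device_sketch *dev;    /* may be NULL */
};

/* Example hook used to observe that polling happened. */
void count_poll_sketch(struct net_device_sketch *dev)
{
    dev->polls++;
}

/* Returns 0 after polling, -1 if there is nothing safe to poll. */
int netpoll_poll_sketch(struct netpoll_sketch *np)
{
    struct net_device_sketch *dev;

    assert(np != NULL);
    dev = np->dev;
    /* Check dev first; only then is dev->netdev_ops safe to read. */
    if (!dev || !dev->netdev_ops || !dev->netdev_ops->ndo_poll_controller)
        return -1;
    dev->netdev_ops->ndo_poll_controller(dev);
    return 0;
}
```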
Stephen Rothwell [Mon, 11 May 2009 21:44:51 +0000 (21:44 +0000)]
net/ibmveth: fix panic in probe
netdev->dev_addr changed from being an array to being a pointer, so we
should not take its address for memcpy().
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
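Why the "&" became a bug can be shown in plain C. While dev_addr was an array member, `&dev->dev_addr` and `dev->dev_addr` referred to the same address, so the extra "&" was harmless. Once dev_addr is a pointer, `&dev->dev_addr` is the address of the pointer variable itself, and a memcpy() to it overwrites the pointer instead of the MAC buffer. The sketch below uses a hypothetical `struct netdev_sketch`, not the kernel's struct net_device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN_SKETCH 6

/* Hypothetical device: dev_addr is a pointer to separately held storage. */
struct netdev_sketch {
    unsigned char *dev_addr;
    unsigned char addr_storage[ETH_ALEN_SKETCH];
};

void set_mac_sketch(struct netdev_sketch *dev, const unsigned char *mac)
{
    assert(dev != NULL && dev->dev_addr != NULL && mac != NULL);
    /*
     * Correct: copy into the buffer that dev_addr points at.
     * memcpy(&dev->dev_addr, mac, ETH_ALEN_SKETCH) would instead clobber
     * the pointer variable, leaving dev_addr pointing at garbage.
     */
    memcpy(dev->dev_addr, mac, ETH_ALEN_SKETCH);
}
```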
Yi Zou [Wed, 13 May 2009 13:12:16 +0000 (13:12 +0000)]
ixgbe: Add FCoE related statistics to 82599
This adds FCoE related statistics to 82599, including the number of Rx-ed and
Tx-ed FCoE packets, the number of Rx-ed and Tx-ed FCoE packets in dwords, the
number of bad Fiber Channel CRCs detected in FCoE packets, and the number of
FCoE packets dropped on the Rx side.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Wed, 13 May 2009 13:11:53 +0000 (13:11 +0000)]
ixgbe: Implement FCoE Rx side large receive offload feature to 82599
This patch implements the FCoE Rx side offload feature in ixgbe_main.c
to 82599 using the Rx offload infrastructure code added in the previous
patch. The large receive offload by Direct Data Placement (DDP) for
FCoE is achieved by implementing the ndo_fcoe_ddp_setup and ndo_fcoe_ddp_done
in net_device_ops via netdev. It is up to the ULD, i.e., fcoe and libfc,
to query and set up large receive offload accordingly through the corresponding
netdev upon creating fcoe instances.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Wed, 13 May 2009 13:11:29 +0000 (13:11 +0000)]
ixgbe: Add infrastructure code for FCoE large receive offload to 82599
This adds infrastructure code for FCoE Rx side offload feature to
82599, which provides large receive offload for FCoE by Direct
Data Placement (DDP). The ixgbe_fcoe_ddp_get() and ixgbe_fcoe_ddp_put()
pair corresponds to the netdev support to FCoE by the function pointers
provided in net_device_ops as ndo_fcoe_ddp_setup and ndo_fcoe_ddp_done.
The implementation of these in ixgbe is shown in the next patch.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Wed, 13 May 2009 13:11:06 +0000 (13:11 +0000)]
ixgbe: Implement FCoE Tx side offload features in base driver of 82599
This patch implements the FCoE Tx side offload features in ixgbe_main.c
to 82599 using the Tx offload infrastructure code added in the previous
patch. This is achieved by calling the FCoE Sequence Offload (FSO)
function ixgbe_fso() on the transmit path of ixgbe.
This patch also includes an EEPROM check to make sure the NIC we're loading
on is an offload-enabled SKU.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Wed, 13 May 2009 13:10:44 +0000 (13:10 +0000)]
ixgbe: Add infrastructure code for FCoE large send offload to 82599
This adds infrastructure code for FCoE Tx side offload feature to
82599, including Fiber Channel CRC calculation, auto insertion of
the start of frame (SOF) and end of frame (EOF) of FCoE packets,
and large send by FCoE Sequence Offload (FSO).
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Wed, 13 May 2009 13:10:21 +0000 (13:10 +0000)]
ixgbe: Add FCoE feature code to 82599
This adds the FCoE feature code ixgbe_fcoe.c to 82599. For a start, this patch
only adds ixgbe_configure_fcoe() to configure the FCoE related registers of 82599.
In patches that follow, I will be adding more functions to ixgbe_fcoe.c to add
support for FCoE offload features to 82599.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Wed, 13 May 2009 13:10:01 +0000 (13:10 +0000)]
ixgbe: Add FCoE feature header to 82599
This adds the FCoE feature header ixgbe_fcoe.h to 82599. This header includes
the defines and structures required by the ixgbe driver to support various
offload features in 82599 for Fiber Channel over Ethernet (FCoE). These offload
features include Fiber Channel CRC calculation, FCoE SOF/EOF auto insertion,
FCoE Sequence Offload (FSO) for large send, and Direct Data Placement (DDP)
for large receive.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Wed, 13 May 2009 13:09:39 +0000 (13:09 +0000)]
ixgbe: Add FCoE feature register defines to 82599
This adds FCoE related register defines to 82599.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emil Medve [Wed, 13 May 2009 06:04:58 +0000 (06:04 +0000)]
mv643xx_eth: Remove a stale PPC_MULTIPLATFORM
PPC_MULTIPLATFORM was killed in commit 28794d3 but this stale occurrence was
hiding the mv643xx_eth driver in some cases (e.g. Pegasos II)
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 12 May 2009 22:40:12 +0000 (22:40 +0000)]
net: remove needless (now buggy) & from dev->dev_addr (part2)
Missed part of "&" removal.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Tue, 12 May 2009 10:48:37 +0000 (10:48 +0000)]
drivers/net/82596.c: suppress warnings
i386 allmodconfig:
drivers/net/82596.c: In function 'init_rx_bufs':
drivers/net/82596.c:544: warning: cast to pointer from integer of different size
drivers/net/82596.c:545: warning: cast to pointer from integer of different size
drivers/net/82596.c:548: warning: cast to pointer from integer of different size
drivers/net/82596.c:557: warning: cast to pointer from integer of different size
drivers/net/82596.c:565: warning: cast to pointer from integer of different size
drivers/net/82596.c:569: warning: cast to pointer from integer of different size
drivers/net/82596.c:575: warning: cast to pointer from integer of different size
drivers/net/82596.c: In function 'rebuild_rx_bufs':
drivers/net/82596.c:606: warning: cast to pointer from integer of different size
drivers/net/82596.c:608: warning: cast to pointer from integer of different size
drivers/net/82596.c: In function 'init_i596_mem':
drivers/net/82596.c:680: warning: cast to pointer from integer of different size
drivers/net/82596.c:681: warning: cast to pointer from integer of different size
drivers/net/82596.c: In function 'i596_rx':
drivers/net/82596.c:818: warning: cast to pointer from integer of different size
drivers/net/82596.c: In function 'i596_add_cmd':
drivers/net/82596.c:975: warning: cast to pointer from integer of different size
drivers/net/82596.c:979: warning: cast to pointer from integer of different size
drivers/net/82596.c: In function 'i596_start_xmit':
drivers/net/82596.c:1088: warning: cast to pointer from integer of different size
drivers/net/82596.c:1099: warning: cast to pointer from integer of different size
drivers/net/82596.c: In function 'i596_interrupt':
drivers/net/82596.c:1404: warning: cast to pointer from integer of different size
(ugh)
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mike Travis [Tue, 12 May 2009 10:48:36 +0000 (10:48 +0000)]
sfc: modify allocation error message
Change error message when alloc_cpumask_var fails.
Repairs "cpumask: convert drivers/net/sfc".
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zefan [Tue, 12 May 2009 10:47:33 +0000 (10:47 +0000)]
cls_cgroup: remove unneeded cgroup_lock
We can remove this lock here, since we are in cgroup write handler and
thus the cgrp is guaranteed to be valid, and no lock is needed when
writing a u32 variable.
Signed-off-by: Li Zefan <lizf@cn.fujitsuc.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Mon, 11 May 2009 23:37:15 +0000 (23:37 +0000)]
net: remove needless (now buggy) & from dev->dev_addr
Patch fixes issues with dev->dev_addr changing from array to pointer.
Hopefully there are no others.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Mon, 11 May 2009 05:52:49 +0000 (05:52 +0000)]
ipv4: remove an unused parameter from configure method of fib_rules_ops.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 17 May 2009 18:55:57 +0000 (11:55 -0700)]
Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
David S. Miller [Sat, 16 May 2009 20:46:06 +0000 (13:46 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/holtmann/bluetooth-2.6
Linus Torvalds [Sat, 16 May 2009 20:41:28 +0000 (13:41 -0700)]
Fix caller information for warn_slowpath_null
Ian Campbell noticed that since "Eliminate thousands of warnings with
gcc 3.2 build" (commit 57adc4d2dbf968fdbe516359688094eef4d46581) all
WARN_ON()'s currently appear to come from warn_slowpath_null(), eg:
WARNING: at kernel/softirq.c:143 warn_slowpath_null+0x1c/0x20()
because now that warn_slowpath_null() is in the call path, the
__builtin_return_address(0) returns that, rather than the place that
caused the warning.
Fix this by splitting up the warn_slowpath_null/fmt cases differently,
using a common helper function, and getting the return address in the
right place. This also happens to avoid the unnecessary stack usage for
the non-stdargs case, and just generally cleans things up.
Make the function name printout use %pS while at it.
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
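The shape of the fix, as described in the message, can be sketched with the GCC/Clang builtin `__builtin_return_address(0)`: each public entry point captures its own return address and hands it to a common helper, so the helper never has to look it up from one call level too deep. The function names and the `last_caller_sketch` variable are hypothetical and do not reproduce kernel/panic.c.

```c
#include <assert.h>
#include <stddef.h>

/* Records the caller reported by the last warning, for inspection. */
void *last_caller_sketch;

/* Common helper: the caller address is passed in, not looked up here. */
static void warn_common_sketch(const char *file, int line, void *caller)
{
    (void)file;
    (void)line;
    last_caller_sketch = caller;
}

/*
 * Public entry point. noinline keeps this a real call, so the builtin
 * yields the address in the code that invoked the warning.
 */
__attribute__((noinline))
void warn_null_sketch(const char *file, int line)
{
    assert(file != NULL);
    warn_common_sketch(file, line, __builtin_return_address(0));
}
```

Had the helper itself used `__builtin_return_address(0)`, every warning would have appeared to come from the entry point, which is the symptom the commit message reports.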
Linus Torvalds [Sat, 16 May 2009 19:47:11 +0000 (12:47 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/bart/ide-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
piix: The Sony TZ90 needs the cable type hardcoding
icside: register second channel of version 6 PCB
ide-tape: remove back-to-back REQUEST_SENSE detection
Linus Torvalds [Sat, 16 May 2009 18:22:06 +0000 (11:22 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: Idle C-states disabled by max_cstate should not disable the TSC
ACPI: idle: fix init-time TSC check regression
ACPI processor: reset the throttling state once it's invalid
ACPI processor: introduce module parameter processor.ignore_tpc
ACPI, i915: build fix
ACPI: suspend: restore BM_RLD on resume
ACPI: resume: re-enable SCI-enable workaround
thermal: fix off-by-1 error in trip point trigger condition
eeepc-laptop: unregister_rfkill_notifier on failure
asus-laptop: fix input keycode
eeepc-laptop: support for super hybrid engine (SHE)
eeepc-laptop: Work around rfkill firmware bug
eeepc-laptop: report brightness control events via the input layer
eeepc-laptop: fix wlan rfkill state change during init
ACPI: suspend: don't let device _PS3 failure prevent suspend
ACPI: power: update error message
ACPI: video: DMI workaround another broken Acer BIOS enabling display brightness
ACPICA: use acpi.* modparam namespace
ACPI video: dmi check for broken _BQC on Acer Aspire 5720
Alan Cox [Sat, 16 May 2009 17:03:36 +0000 (19:03 +0200)]
piix: The Sony TZ90 needs the cable type hardcoding
The Sony TZ90 needs the cable type hardcoding. See bug #12734
Signed-off-by: Alan Cox <alan@linux.intel.com>
Reported-by: Jonathan E. Snow <jesnow@uh.edu>
[bart: port it from ata_piix to piix and give reporter the proper credit]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Sat, 16 May 2009 17:03:36 +0000 (19:03 +0200)]
icside: register second channel of version 6 PCB
The second IDE channel of version 6 PCB is not being registered anymore since
the commit 48c3c1072651922ed153bcf0a33ea82cf20df390 (ide: add struct ide_host
(take 3)).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Tejun Heo [Sat, 18 Apr 2009 22:00:41 +0000 (07:00 +0900)]
ide-tape: remove back-to-back REQUEST_SENSE detection
Impact: fix an oops which always triggers
ide_tape_issue_pc() assumed drive->pc isn't NULL on invocation when
checking for back-to-back request sense issues but drive->pc can be
NULL and even when it's not NULL, it's not safe to dereference it once
the previous command is complete because pc could have been freed or
was on stack. Kill back-to-back REQUEST_SENSE detection.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Len Brown [Sat, 16 May 2009 05:55:59 +0000 (01:55 -0400)]
Merge branches 'release', 'bugzilla-13032', 'bugzilla-13041+', 'bugzilla-13121', 'bugzilla-13165', 'bugzilla-13243', 'bugzilla-13259', 'resume-sci-en-regression', 'thermal-regression', 'tsc-regression' and 'asus-2.6.30' into release