Havard Skinnemoen [Thu, 26 Apr 2012 18:16:00 +0000 (11:16 -0700)]
HID: hiddev: Use vzalloc to allocate hiddev_list
Everytime a HID device is opened, a new hiddev_list is allocated with
kzalloc. This requires 64KB of physically contiguous memory, which could
easily push a heavily loaded system over the edge.
Allocating the same amount of memory with vmalloc shouldn't be nearly as
demanding, so let's do that instead. The memory isn't used for DMA and
doesn't look particularly performance sensitive, so this should be safe.
Signed-off-by: Havard Skinnemoen <hskinnemoen@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Benedikt Bergenthal [Wed, 18 Apr 2012 10:22:40 +0000 (12:22 +0200)]
HID: hid-apple: fix a tab width style issue
Fixed a style issue.
Signed-off-by: Benedikt Bergenthal <benedikt@kdrennert.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Josenivaldo Benito Junior [Tue, 10 Apr 2012 22:29:04 +0000 (19:29 -0300)]
HID: Aureal Remote Control Device Driver
Devices like Aureal Cy se W-01RN USB_V3.1 and some derived hardware
have a bogus HID Report Descriptor. According to that report descriptor,
the maximum logical value for key events is 1 and not 101 (101 keys).
This quirk fixes this wrong Report Descriptor.
Signed-off-by: Josenivaldo Benito Junior <jrbenito@benito.qsl.br>
Signed-off-by: Franco Catrin <fcatrin@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Simon Haggett [Tue, 3 Apr 2012 15:04:15 +0000 (16:04 +0100)]
HID: usbhid: Check HID report descriptor contents after device reset
When a USB device reset occurs, usbcore will refetch the device and configuration
descriptors and compare them with those retrieved before the reset to ensure
that they have not changed. For USB HID devices, this implicitly includes the
HID class descriptor (as this is fetched with the configuration descriptor).
However, the HID report descriptor is not checked again.
Whilst a change in the size of the HID report descriptor will be detected (as
this is held in the class descriptor), content changes to the report descriptor
which do not result in a change in its size will be missed. If a firmware update
were applied to a USB HID device which resulted in such a change to the report
descriptor after device reset, then this would not be picked up by usbhid.
This patch fixes this issue by allowing usbhid to check the contents of the
report descriptor after the device reset, and trigger a rebind of the device
if there is a mismatch.
Reviewed-by: Toby Gray <toby.gray@realvnc.com>
Signed-off-by: Simon Haggett <simon.haggett@realvnc.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Jiri Kosina [Fri, 30 Mar 2012 13:26:57 +0000 (15:26 +0200)]
HID: tivo: fix support for bluetooth version of tivo Slide
The device is a bluetooth device, but one occurence by mistake
had marked it as USB.
Reported-by: Joshua Dillon <jvdillon@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Oliver Neukum [Wed, 28 Mar 2012 11:16:19 +0000 (13:16 +0200)]
HID: usbhid: fix error handling of not enough bandwidth
In case IO cannot be started because there is a lack of bandwidth
on the bus, it makes no sense to reset the device. If IO is requested
because the device is opened, user space should be notified with
an error right away. If the lack of bandwidth arises later, for
example after resume, there's no other choice but to retry in the
hope that bandwidth will be freed.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Linus Torvalds [Wed, 21 Mar 2012 04:11:42 +0000 (21:11 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
"It contains HID driver updates all over the place -- a lot of new
hardware support especially in the multitouch area, including generic
handling of all multitouch devices by the hid-multitiouch driver
automatically."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (42 commits)
HID: multitouch: add PID for Fructel product
HID: wacom: Add reporting of wheel for Intuos4 WL
HID: wacom: Replace __set_bit with input_set_capability
HID: tivo: add support for BT-version (0x1200)
HID: wacom: Reset stylus buttons - Intuos4 WL
HID: multitouch: detect serial protocol
HID: handle all multitouch devices through hid-multitouch
HID: multitouch: fix handling of buggy reports descriptors for Dell ST2220T
HID: make it possible to force hid-core claim the device
HID: multitouch: add support for eGalax 0x722a
HID: usbhid: add quirk no_get for quanta 3008 devices
HID: multitouch: add more eGalax devices
HID: multitouch: add new PID from Ideacom
HID: multitouch: add support for Atmel maXTouch 03eb:2118
HID: waltop: Add support for tablet with PID 0038
HID: waltop: Replace original rdescs with links
HID: uclogic: Replace original rdescs with links
HID: wacom: Add pad buttons reporting on Intuos4 WL
HID: wacom: report distance for Intuos4 WL
HID: kye: Add support for 3 tablets
...
Linus Torvalds [Wed, 21 Mar 2012 04:04:47 +0000 (21:04 -0700)]
Merge git://git./linux/kernel/git/davem/net-next
Pull networking merge from David Miller:
"1) Move ixgbe driver over to purely page based buffering on receive.
From Alexander Duyck.
2) Add receive packet steering support to e1000e, from Bruce Allan.
3) Convert TCP MD5 support over to RCU, from Eric Dumazet.
4) Reduce cpu usage in handling out-of-order TCP packets on modern
systems, also from Eric Dumazet.
5) Support the IP{,V6}_UNICAST_IF socket options, making the wine
folks happy, from Erich Hoover.
6) Support VLAN trunking from guests in hyperv driver, from Haiyang
Zhang.
7) Support byte-queue-limtis in r8169, from Igor Maravic.
8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but
was never properly implemented, Jiri Benc fixed that.
9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang.
10) Support kernel side dump filtering by ctmark in netfilter
ctnetlink, from Pablo Neira Ayuso.
11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker.
12) Add new peek socket options to assist with socket migration, from
Pavel Emelyanov.
13) Add sch_plug packet scheduler whose queue is controlled by
userland daemons using explicit freeze and release commands. From
Shriram Rajagopalan.
14) Fix FCOE checksum offload handling on transmit, from Yi Zou."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits)
Fix pppol2tp getsockname()
Remove printk from rds_sendmsg
ipv6: fix incorrent ipv6 ipsec packet fragment
cpsw: Hook up default ndo_change_mtu.
net: qmi_wwan: fix build error due to cdc-wdm dependecy
netdev: driver: ethernet: Add TI CPSW driver
netdev: driver: ethernet: add cpsw address lookup engine support
phy: add am79c874 PHY support
mlx4_core: fix race on comm channel
bonding: send igmp report for its master
fs_enet: Add MPC5125 FEC support and PHY interface selection
net: bpf_jit: fix BPF_S_LDX_B_MSH compilation
net: update the usage of CHECKSUM_UNNECESSARY
fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx
net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso
ixgbe: Fix issues with SR-IOV loopback when flow control is disabled
net/hyperv: Fix the code handling tx busy
ixgbe: fix namespace issues when FCoE/DCB is not enabled
rtlwifi: Remove unused ETH_ADDR_LEN defines
igbvf: Use ETH_ALEN
...
Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and
drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
Linus Torvalds [Wed, 21 Mar 2012 01:13:22 +0000 (18:13 -0700)]
Merge branch 'for-3.4' of git://git./linux/kernel/git/tj/wq
Pull workqueue changes from Tejun Heo:
"This contains only one commit which cleans up UP allocation path a
bit."
* 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: use percpu allocator for cwq on UP
Linus Torvalds [Wed, 21 Mar 2012 01:11:21 +0000 (18:11 -0700)]
Merge branch 'for-3.4' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
"Out of the 8 commits, one fixes a long-standing locking issue around
tasklist walking and others are cleanups."
* 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list
cgroup: Remove wrong comment on cgroup_enable_task_cg_list()
cgroup: remove cgroup_subsys argument from callbacks
cgroup: remove extra calls to find_existing_css_set
cgroup: replace tasklist_lock with rcu_read_lock
cgroup: simplify double-check locking in cgroup_attach_proc
cgroup: move struct cgroup_pidlist out from the header file
cgroup: remove cgroup_attach_task_current_cg()
Oleg Nesterov [Mon, 19 Mar 2012 16:04:01 +0000 (17:04 +0100)]
exec: move de_thread()->setmax_mm_hiwater_rss() into exec_mmap()
Minor cleanup. de_thread()->setmax_mm_hiwater_rss() looks a bit
strange, move it into exec_mmap() which plays with old_mm.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Mon, 19 Mar 2012 16:03:41 +0000 (17:03 +0100)]
exit_signal: fix the "parent has changed security domain" logic
exit_notify() changes ->exit_signal if the parent already did exec.
This doesn't really work, we are not going to send the signal now
if there is another live thread or the exiting task is traced. The
parent can exec before the last dies or the tracer detaches.
Move this check into do_notify_parent() which actually sends the
signal.
The user-visible change is that we do not change ->exit_signal,
and thus the exiting task is still "clone children" for
do_wait()->eligible_child(__WCLONE). Hopefully this is fine, the
current logic is racy anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Mon, 19 Mar 2012 16:03:22 +0000 (17:03 +0100)]
exit_signal: simplify the "we have changed execution domain" logic
exit_notify() checks "tsk->self_exec_id != tsk->parent_exec_id"
to handle the "we have changed execution domain" case.
We can change do_thread() to always set ->exit_signal = SIGCHLD
and remove this check to simplify the code.
We could change setup_new_exec() instead, this looks more logical
because it increments ->self_exec_id. But note that de_thread()
already resets ->exit_signal if it changes the leader, let's keep
both changes close to each other.
Note that we change ->exit_signal lockless, this changes the rules.
Thereafter ->exit_signal is not stable under tasklist but this is
fine, the only possible change is OLDSIG -> SIGCHLD. This can race
with eligible_child() but the race is harmless. We can race with
reparent_leader() which changes our ->exit_signal in parallel, but
it does the same change to SIGCHLD.
The noticeable user-visible change is that the execing task is not
"visible" to do_wait()->eligible_child(__WCLONE) right after exec.
To me this looks more logical, and this is consistent with mt case.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 14 Mar 2012 18:55:38 +0000 (19:55 +0100)]
CLONE_PARENT shouldn't allow to set ->exit_signal
The child must not control its ->exit_signal, it is the parent who
decides which signal the child should use for notification.
This means that CLONE_PARENT should not use "clone_flags & CSIGNAL",
the forking task is the sibling of the new process and their parent
doesn't control exit_signal in this case.
This patch uses ->exit_signal of the forking process, but perhaps
we should simply use SIGCHLD.
We read group_leader->exit_signal lockless, this can race with the
ORIGINAL_SIGNAL -> SIGCHLD transition, but this is fine.
Potentially this change allows to kill self_exec_id/parent_exec_id.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Benjamin LaHaise [Tue, 20 Mar 2012 03:57:54 +0000 (03:57 +0000)]
Fix pppol2tp getsockname()
While testing L2TP functionality, I came across a bug in getsockname(). The
IP address returned within the pppol2tp_addr's addr memember was not being
set to the IP address in use. This bug is caused by using inet_sk() on the
wrong socket (the L2TP socket rather than the underlying UDP socket), and was
likely introduced during the addition of L2TPv3 support.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Jones [Mon, 19 Mar 2012 13:01:07 +0000 (13:01 +0000)]
Remove printk from rds_sendmsg
no socket layer outputs a message for this error and neither should rds.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 20 Mar 2012 18:26:30 +0000 (11:26 -0700)]
Merge tag 'usb-3.3' of git://git./linux/kernel/git/gregkh/usb
Pull USB merge for 3.4-rc1 from Greg KH:
"Here's the big USB merge for the 3.4-rc1 merge window.
Lots of gadget driver reworks here, driver updates, xhci changes, some
new drivers added, usb-serial core reworking to fix some bugs, and
other various minor things.
There are some patches touching arch code, but they have all been
acked by the various arch maintainers."
* tag 'usb-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (302 commits)
net: qmi_wwan: add support for ZTE MF820D
USB: option: add ZTE MF820D
usb: gadget: f_fs: Remove lock is held before freeing checks
USB: option: make interface blacklist work again
usb/ub: deprecate & schedule for removal the "Low Performance USB Block" driver
USB: ohci-pxa27x: add clk_prepare/clk_unprepare calls
USB: use generic platform driver on ath79
USB: EHCI: Add a generic platform device driver
USB: OHCI: Add a generic platform device driver
USB: ftdi_sio: new PID: LUMEL PD12
USB: ftdi_sio: add support for FT-X series devices
USB: serial: mos7840: Fixed MCS7820 device attach problem
usb: Don't make USB_ARCH_HAS_{XHCI,OHCI,EHCI} depend on USB_SUPPORT.
usb gadget: fix a section mismatch when compiling g_ffs with CONFIG_USB_FUNCTIONFS_ETH
USB: ohci-nxp: Remove i2c_write(), use smbus
USB: ohci-nxp: Support for LPC32xx
USB: ohci-nxp: Rename symbols from pnx4008 to nxp
USB: OHCI-HCD: Rename ohci-pnx4008 to ohci-nxp
usb: gadget: Kconfig: fix typo for 'different'
usb: dwc3: pci: fix another failure path in dwc3_pci_probe()
...
Linus Torvalds [Tue, 20 Mar 2012 18:24:39 +0000 (11:24 -0700)]
Merge tag 'tty-3.3' of git://git./linux/kernel/git/gregkh/tty
Pull TTY/serial patches from Greg KH:
"tty and serial merge for 3.4-rc1
Here's the big serial and tty merge for the 3.4-rc1 tree.
There's loads of fixes and reworks in here from Jiri for the tty
layer, and a number of patches from Alan to help try to wrestle the vt
layer into a sane model.
Other than that, lots of driver updates and fixes, and other minor
stuff, all detailed in the shortlog."
* tag 'tty-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (132 commits)
serial: pxa: add clk_prepare/clk_unprepare calls
TTY: Wrong unicode value copied in con_set_unimap()
serial: PL011: clear pending interrupts
serial: bfin-uart: Don't access tty circular buffer in TX DMA interrupt after it is reset.
vt: NULL dereference in vt_do_kdsk_ioctl()
tty: serial: vt8500: fix annotations for probe/remove
serial: remove back and forth conversions in serial_out_sync
serial: use serial_port_in/out vs serial_in/out in 8250
serial: introduce generic port in/out helpers
serial: reduce number of indirections in 8250 code
serial: delete useless void casts in 8250.c
serial: make 8250's serial_in shareable to other drivers.
serial: delete last unused traces of pausing I/O in 8250
pch_uart: Add module parameter descriptions
pch_uart: Use existing default_baud in setup_console
pch_uart: Add user_uartclk parameter
pch_uart: Add Fish River Island II uart clock quirks
pch_uart: Use uartclk instead of base_baud
mpc5200b/uart: select more tolerant uart prescaler on low baudrates
tty: moxa: fix bit test in moxa_start()
...
Linus Torvalds [Tue, 20 Mar 2012 18:23:18 +0000 (11:23 -0700)]
Merge tag 'staging-3.3' of git://git./linux/kernel/git/gregkh/staging
Pull big staging driver updates from Greg KH:
"Here is the big drivers/staging/ merge for 3.4-rc1
Lots of new driver updates here, with the addition of a few new ones,
and only one moving out of the staging tree to the "real" part of the
kernel (the hyperv scsi driver, acked by the scsi maintainer).
There are also loads of cleanups, fixes, and other minor things in
here, all self-contained in the drivers/staging/ tree.
Overall we reversed the recent trend by adding more lines than we
removed:
379 files changed, 37952 insertions(+), 14153 deletions(-)"
* tag 'staging-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (360 commits)
staging/zmem: Use lockdep_assert_held instead of spin_is_locked
Staging: rtl8187se: r8180_wx.c: Cleaned up comments
Staging: rtl8187se: r8180_wx.c: Removed old comments
Staging: rtl8187se: r8180_dm.c: Removed old comments
Staging: android: ram_console.c:
Staging: rtl8187se: r8180_dm.c: Fix comments
Staging: rtl8187se: r8180_dm.c: Fix spacing issues
Staging: rtl8187se: r8180_dm.c Fixed indentation issues
Staging: rtl8187se: r8180_dm.c: Fix brackets
Staging: rtl8187se: r8180_dm.c: Removed spaces before tab stop
staging: vme: fix section mismatches in linux-next
20120314
Staging: rtl8187se: r8180_core.c: Fix some long line issues
Staging: rtl8187se: r8180_core.c: Fix some spacing issues
Staging: rtl8187se: r8180_core.c: Removed trailing spaces
staging: mei: remove driver internal versioning
Staging: rtl8187se: r8180_core.c: Cleaned up if statement
staging: ozwpan depends on NET
staging: ozwpan: added maintainer for ozwpan driver
staging/mei: propagate error codes up in the write flow
drivers:staging:mei Fix some typos in staging/mei
...
Linus Torvalds [Tue, 20 Mar 2012 18:16:20 +0000 (11:16 -0700)]
Merge tag 'driver-core-3.3' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core patches for 3.4-rc1 from Greg KH:
"Here's the big driver core merge for 3.4-rc1.
Lots of various things here, sysfs fixes/tweaks (with the nlink
breakage reverted), dynamic debugging updates, w1 drivers, hyperv
driver updates, and a variety of other bits and pieces, full
information in the shortlog."
* tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (78 commits)
Tools: hv: Support enumeration from all the pools
Tools: hv: Fully support the new KVP verbs in the user level daemon
Drivers: hv: Support the newly introduced KVP messages in the driver
Drivers: hv: Add new message types to enhance KVP
regulator: Support driver probe deferral
Revert "sysfs: Kill nlink counting."
uevent: send events in correct order according to seqnum (v3)
driver core: minor comment formatting cleanups
driver core: move the deferred probe pointer into the private area
drivercore: Add driver probe deferral mechanism
DS2781 Maxim Stand-Alone Fuel Gauge battery and w1 slave drivers
w1_bq27000: Only one thread can access the bq27000 at a time.
w1_bq27000 - remove w1_bq27000_write
w1_bq27000: remove unnecessary NULL test.
sysfs: Fix memory leak in sysfs_sd_setsecdata().
intel_idle: Revert change of auto_demotion_disable_flags for Nehalem
w1: Fix w1_bq27000
driver-core: documentation: fix up Greg's email address
powernow-k6: Really enable auto-loading
powernow-k7: Fix CPU family number
...
Linus Torvalds [Tue, 20 Mar 2012 18:15:18 +0000 (11:15 -0700)]
Merge tag 'char-misc-3.3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char and misc patches for 3.4-rc1 from Greg KH:
"Not much here, just a few minor fixes and some conversions to the
module_*_driver() functions, making the codebase smaller."
* tag 'char-misc-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: bmp085: Use unsigned long to store jiffies
char/ramoops: included linux/err.h twice
misc: bmp085: Handle jiffies overflow correctly
misc: fsa9480: Remove obsolete cleanup for clientdata
char: Fix typo in tlclk.c
char: Fix typo in viotape.c
cs5535-mfgpt: don't call __init function from __devinit
MISC: convert drivers/misc/* to use module_spi_driver()
MISC: convert drivers/misc/* to use module_i2c_driver()
MISC: convert drivers/misc/* to use module_platform_driver()
Dan Carpenter [Tue, 20 Mar 2012 16:58:06 +0000 (16:58 +0000)]
AFS: checking wrong bit in afs_readpages()
We should be testing "if (vnode->flags & (1 << 4))" instead of
"if (vnode->flags & 4) {". The current test checks if the data was
modified instead of deleted.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 20 Mar 2012 17:32:09 +0000 (10:32 -0700)]
Merge branch 'timers-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer changes for v3.4 from Ingo Molnar
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
ntp: Fix integer overflow when setting time
math: Introduce div64_long
cs5535-clockevt: Allow the MFGPT IRQ to be shared
cs5535-clockevt: Don't ignore MFGPT on SMP-capable kernels
x86/time: Eliminate unused irq0_irqs counter
clocksource: scx200_hrt: Fix the build
x86/tsc: Reduce the TSC sync check time for core-siblings
timer: Fix bad idle check on irq entry
nohz: Remove ts->Einidle checks before restarting the tick
nohz: Remove update_ts_time_stat from tick_nohz_start_idle
clockevents: Leave the broadcast device in shutdown mode when not needed
clocksource: Load the ACPI PM clocksource asynchronously
clocksource: scx200_hrt: Convert scx200 to use clocksource_register_hz
clocksource: Get rid of clocksource_calc_mult_shift()
clocksource: dbx500: convert to clocksource_register_hz()
clocksource: scx200_hrt: use pr_<level> instead of printk
time: Move common updates to a function
time: Reorder so the hot data is together
time: Remove most of xtime_lock usage in timekeeping.c
ntp: Add ntp_lock to replace xtime_locking
...
Linus Torvalds [Tue, 20 Mar 2012 17:31:44 +0000 (10:31 -0700)]
Merge branch 'sched-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler changes for v3.4 from Ingo Molnar
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
printk: Make it compile with !CONFIG_PRINTK
sched/x86: Fix overflow in cyc2ns_offset
sched: Fix nohz load accounting -- again!
sched: Update yield() docs
printk/sched: Introduce special printk_sched() for those awkward moments
sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer
sched: Cleanup cpu_active madness
sched: Fix load-balance wreckage
sched: Clean up parameter passing of proc_sched_autogroup_set_nice()
sched: Ditch per cgroup task lists for load-balancing
sched: Rename load-balancing fields
sched: Move load-balancing arguments into helper struct
sched/rt: Do not submit new work when PI-blocked
sched/rt: Prevent idle task boosting
sched/wait: Add __wake_up_all_locked() API
sched/rt: Document scheduler related skip-resched-check sites
sched/rt: Use schedule_preempt_disabled()
sched/rt: Add schedule_preempt_disabled()
sched/rt: Do not throttle when PI boosting
sched/rt: Keep period timer ticking when rt throttling is active
...
Linus Torvalds [Tue, 20 Mar 2012 17:29:15 +0000 (10:29 -0700)]
Merge branch 'perf-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf events changes for v3.4 from Ingo Molnar:
- New "hardware based branch profiling" feature both on the kernel and
the tooling side, on CPUs that support it. (modern x86 Intel CPUs
with the 'LBR' hardware feature currently.)
This new feature is basically a sophisticated 'magnifying glass' for
branch execution - something that is pretty difficult to extract from
regular, function histogram centric profiles.
The simplest mode is activated via 'perf record -b', and the result
looks like this in perf report:
$ perf record -b any_call,u -e cycles:u branchy
$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printf
This output shows from/to branch columns and shows the highest
percentage (from,to) jump combinations - i.e. the most likely taken
branches in the system. "branches" can also include function calls
and any other synchronous and asynchronous transitions of the
instruction pointer that are not 'next instruction' - such as system
calls, traps, interrupts, etc.
This feature comes with (hopefully intuitive) flat ascii and TUI
support in perf report.
- Various 'perf annotate' visual improvements for us assembly junkies.
It will now recognize function calls in the TUI and by hitting enter
you can follow the call (recursively) and back, amongst other
improvements.
- Multiple threads/processes recording support in perf record, perf
stat, perf top - which is activated via a comma-list of PIDs:
perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485
- Support for per UID views, via the --uid paramter to perf top, perf
report, etc. For example 'perf top --uid mingo' will only show the
tasks that I am running, excluding other users, root, etc.
- Jump label restructurings and improvements - this includes the
factoring out of the (hopefully much clearer) include/linux/static_key.h
generic facility:
struct static_key key = STATIC_KEY_INIT_FALSE;
...
if (static_key_false(&key))
do unlikely code
else
do likely code
...
static_key_slow_inc();
...
static_key_slow_inc();
...
The static_key_false() branch will be generated into the code with as
little impact to the likely code path as possible. the
static_key_slow_*() APIs flip the branch via live kernel code patching.
This facility can now be used more widely within the kernel to
micro-optimize hot branches whose likelihood matches the static-key
usage and fast/slow cost patterns.
- SW function tracer improvements: perf support and filtering support.
- Various hardenings of the perf.data ABI, to make older perf.data's
smoother on newer tool versions, to make new features integrate more
smoothly, to support cross-endian recording/analyzing workflows
better, etc.
- Restructuring of the kprobes code, the splitting out of 'optprobes',
and a corner case bugfix.
- Allow the tracing of kernel console output (printk).
- Improvements/fixes to user-space RDPMC support, allowing user-space
self-profiling code to extract PMU counts without performing any
system calls, while playing nice with the kernel side.
- 'perf bench' improvements
- ... and lots of internal restructurings, cleanups and fixes that made
these features possible. And, as usual this list is incomplete as
there were also lots of other improvements
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
perf report: Fix annotate double quit issue in branch view mode
perf report: Remove duplicate annotate choice in branch view mode
perf/x86: Prettify pmu config literals
perf report: Enable TUI in branch view mode
perf report: Auto-detect branch stack sampling mode
perf record: Add HEADER_BRANCH_STACK tag
perf record: Provide default branch stack sampling mode option
perf tools: Make perf able to read files from older ABIs
perf tools: Fix ABI compatibility bug in print_event_desc()
perf tools: Enable reading of perf.data files from different ABI rev
perf: Add ABI reference sizes
perf report: Add support for taken branch sampling
perf record: Add support for sampling taken branch
perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
x86/kprobes: Split out optprobe related code to kprobes-opt.c
x86/kprobes: Fix a bug which can modify kernel code permanently
x86/kprobes: Fix instruction recovery on optimized path
perf: Add callback to flush branch_stack on context switch
perf: Disable PERF_SAMPLE_BRANCH_* when not supported
perf/x86: Add LBR software filter support for Intel CPUs
...
Linus Torvalds [Tue, 20 Mar 2012 17:28:56 +0000 (10:28 -0700)]
Merge branch 'irq-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq/core changes for v3.4 from Ingo Molnar
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Remove paranoid warnons and bogus fixups
genirq: Flush the irq thread on synchronization
genirq: Get rid of unnecessary IRQTF_DIED flag
genirq: No need to check IRQTF_DIED before stopping a thread handler
genirq: Get rid of unnecessary irqaction field in task_struct
genirq: Fix incorrect check for forced IRQ thread handler
softirq: Reduce invoke_softirq() code duplication
genirq: Fix long-term regression in genirq irq_set_irq_type() handling
x86-32/irq: Don't switch to irq stack for a user-mode irq
Linus Torvalds [Tue, 20 Mar 2012 00:12:34 +0000 (17:12 -0700)]
Merge branch 'core-rcu-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RCU changes for v3.4 from Ingo Molnar. The major features of this
series are:
- making RCU more aggressive about entering dyntick-idle mode in order
to improve energy efficiency
- converting a few more call_rcu()s to kfree_rcu()s
- applying a number of rcutree fixes and cleanups to rcutiny
- removing CONFIG_SMP #ifdefs from treercu
- allowing RCU CPU stall times to be set via sysfs
- adding CPU-stall capability to rcutorture
- adding more RCU-abuse diagnostics
- updating documentation
- fixing yet more issues located by the still-ongoing top-to-bottom
inspection of RCU, this time with a special focus on the CPU-hotplug
code path.
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
rcu: Stop spurious warnings from synchronize_sched_expedited
rcu: Hold off RCU_FAST_NO_HZ after timer posted
rcu: Eliminate softirq-mediated RCU_FAST_NO_HZ idle-entry loop
rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections
rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
rcu: Remove redundant check for rcu_head misalignment
PTR_ERR should be called before its argument is cleared.
rcu: Convert WARN_ON_ONCE() in rcu_lock_acquire() to lockdep
rcu: Trace only after NULL-pointer check
rcu: Call out dangers of expedited RCU primitives
rcu: Rework detection of use of RCU by offline CPUs
lockdep: Add CPU-idle/offline warning to lockdep-RCU splat
rcu: No interrupt disabling for rcu_prepare_for_idle()
rcu: Move synchronize_sched_expedited() to rcutree.c
rcu: Check for illegal use of RCU from offlined CPUs
rcu: Update stall-warning documentation
rcu: Add CPU-stall capability to rcutorture
rcu: Make documentation give more realistic rcutorture duration
rcutorture: Permit holding off CPU-hotplug operations during boot
rcu: Print scheduling-clock information on RCU CPU stall-warning messages
...
Jiri Kosina [Tue, 20 Mar 2012 12:18:05 +0000 (13:18 +0100)]
Merge branch 'upstream' into for-linus
Conflicts:
drivers/hid/Makefile
Andreas Nielsen [Mon, 19 Mar 2012 14:41:03 +0000 (15:41 +0100)]
HID: multitouch: add PID for Fructel product
Adds multitouch support for the Gametel Android game controller.
The multitouch events are emulated by the Gametel device. Each physical button
is configured to generate a MT event on a specific coordinate. This seems to be
the only way for us to support Android games that doesn't support HID gamepads.
It is possible to inject MT events at Android level, but this requires root on
the phone.
Signed-off-by: Andreas Nielsen <eas@svep.se>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Jiri Kosina [Tue, 20 Mar 2012 12:05:18 +0000 (13:05 +0100)]
Merge branches 'roccat' and 'wacom' into for-linus
Jiri Kosina [Tue, 20 Mar 2012 12:04:25 +0000 (13:04 +0100)]
Merge branches 'battery-scope', 'logitech' and 'multitouch' into for-linus
Gao feng [Mon, 19 Mar 2012 22:36:10 +0000 (22:36 +0000)]
ipv6: fix incorrent ipv6 ipsec packet fragment
Since commit
299b0767(ipv6: Fix IPsec slowpath fragmentation problem)
In func ip6_append_data,after call skb_put(skb, fraglen + dst_exthdrlen)
the skb->len contains dst_exthdrlen,and we don't reduce dst_exthdrlen at last
This will make fraggap>0 in next "while cycle",and cause the size of skb incorrent
Fix this by reserve headroom for dst_exthdrlen.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Mar 2012 04:33:59 +0000 (00:33 -0400)]
cpsw: Hook up default ndo_change_mtu.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 20 Mar 2012 00:11:15 +0000 (17:11 -0700)]
Merge branch 'core-locking-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core/locking changes for v3.4 from Ingo Molnar
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Simplify return logic
futex: Cover all PI opcodes with cmpxchg enabled check
Linus Torvalds [Tue, 20 Mar 2012 00:10:38 +0000 (17:10 -0700)]
Merge branch 'core-iommu-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core/iommu changes for v3.4 from Ingo Molnar
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/iommu/intel: Increase the number of iommus supported to MAX_IO_APICS
x86/iommu/intel: Fix identity mapping for sandy bridge
Linus Torvalds [Mon, 19 Mar 2012 23:37:28 +0000 (16:37 -0700)]
Merge branch 'dcache-word-accesses'
* branch 'dcache-word-accesses':
vfs: use 'unsigned long' accesses for dcache name comparison and hashing
This does the name hashing and lookup using word-sized accesses when
that is efficient, namely on x86 (although any little-endian machine
with good unaligned accesses would do).
It does very much depend on little-endian logic, but it's a very hot
couple of functions under some real loads, and this patch improves the
performance of __d_lookup_rcu() and link_path_walk() by up to about 30%.
Giving a 10% improvement on some very pathname-heavy benchmarks.
Because we do make unaligned accesses past the filename, the
optimization is disabled when CONFIG_DEBUG_PAGEALLOC is active, and we
effectively depend on the fact that on x86 we don't really ever have the
last page of usable RAM followed immediately by any IO memory (due to
ACPI tables, BIOS buffer areas etc).
Some of the bit operations we do are a bit "subtle". It's commented,
but you do need to really think about the code. Or just consider it
black magic.
Thanks to people on G+ for some of the optimized bit tricks.
Linus Torvalds [Mon, 19 Mar 2012 23:19:53 +0000 (16:19 -0700)]
vfs: get rid of batshit-insane pointless dentry hash calculations
For some odd historical reason, the final mixing round for the dentry
cache hash table lookup had an insane "xor with big constant" logic. In
two places.
The big constant that is being xor'ed is GOLDEN_RATIO_PRIME, which is a
fairly random-looking number that is designed to be *multiplied* with so
that the bits get spread out over a whole long-word.
But xor'ing with it is insane. It doesn't really even change the hash -
it really only shifts the hash around in the hash table. To make
matters worse, the insane big constant is different on 32-bit and 64-bit
builds, even though the name hash bits we use are always 32-bit (and the
bits from the pointer we mix in effectively are too).
It's all total voodoo programming, in other words.
Now, some testing and analysis of the hash chains shows that the rest of
the hash function seems to be fairly good. It does pick the right bits
of the parent dentry pointer, for example, and while it's generally a
bad idea to use an xor to mix down the upper bits (because if there is a
repeating pattern, the xor can cause "destructive interference"), it
seems to not have been a disaster.
For example, replacing the hash with the normal "hash_long()" code (that
uses the GOLDEN_RATIO_PRIME constant correctly, btw) actually just makes
the hash worse. The hand-picked hash knew which bits of the pointer had
the highest entropy, and hash_long() ends up mixing bits less optimally
at least in some trivial tests.
So the hash function overall seems fine, it just has that really odd
"shift result around by a constant xor".
So get rid of the silly xor, and replace the down-mixing of the bits
with an add instead of an xor that tends to not have the same kind of
destructive interference issues. Some stats on the resulting hash
chains shows that they look statistically identical before and after,
but the code is simpler and no longer makes you go "WTF?".
Also, the incoming hash really is just "unsigned int", not a long, and
there's no real point to worry about the high 26 bits of the dentry
pointer for the 64-bit case, because they are all going to be identical
anyway.
So also change the hashing to be done in the more natural 'unsigned int'
that is the real size of the actual hashed data anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bjørn Mork [Mon, 19 Mar 2012 07:48:29 +0000 (07:48 +0000)]
net: qmi_wwan: fix build error due to cdc-wdm dependecy
Fixes:
drivers/built-in.o: In function `qmi_wwan_bind_shared':
qmi_wwan.c:(.text+0x25b686): undefined reference to `usb_cdc_wdm_register'
make[1]: *** [.tmp_vmlinux1] Error 1
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Sun, 18 Mar 2012 20:17:54 +0000 (20:17 +0000)]
netdev: driver: ethernet: Add TI CPSW driver
This patch adds support for TI's CPSW driver.
The three port switch gigabit ethernet subsystem provides ethernet packet
communication and can be configured as an ethernet switch. Supports
10/100/1000 Mbps.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Sriramakrishnan A G <srk@ti.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Sun, 18 Mar 2012 20:17:53 +0000 (20:17 +0000)]
netdev: driver: ethernet: add cpsw address lookup engine support
TI CPSW ethernet switch has a built-in address lookup engine. This patch adds
the code necessary for programming this module.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiko Schocher [Sun, 18 Mar 2012 11:03:05 +0000 (11:03 +0000)]
phy: add am79c874 PHY support
Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eugenia Emantayev [Sun, 18 Mar 2012 04:32:08 +0000 (04:32 +0000)]
mlx4_core: fix race on comm channel
Prevent race condition between commands on comm channel.
Happened while unloading the driver when switching from
event to polling mode. VF got completion on the last command
before switching to polling mode, but toggle was not changed.
After the fix - VF will not write the next command before
toggle is updated.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Pan(潘卫平) [Sat, 17 Mar 2012 17:23:27 +0000 (17:23 +0000)]
bonding: send igmp report for its master
Liang Zheng(lzheng@redhat.com) found that in the following topo,
bonding does not send igmp report when we trigger a fail-over of bonding.
eth0--
|-- bond0 -- br0
eth1--
modprobe bonding mode=1 miimon=100 resend_igmp=10
ifconfig bond0 up
ifenslave bond0 eth0 eth1
brctl addbr br0
ifconfig br0 192.168.100.2/24 up
brctl addif br0 bond0
Add 192.168.100.2(br0) into a multicast group, like 224.10.10.10,
then trigger a fali-over in bonding.
You can see that parameter "resend_igmp" does not work.
The reason is that when we add br0 into a multicast group,
it does not propagate multicast knowledge down to its ports.
If we choose to propagate multicast knowledge down to all ports for bridge,
then we have to track every change that is done to bridge, and keep a backup
for all ports. It is hard to track, I think.
Instead I choose to modify bonding to send igmp report for its master.
Changelog:
V2: correct comments
V3: move this check into bond_resend_igmp_join_requests()
V4: only send igmp reports if bond is enslaved to a bridge
Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Ermakov [Sat, 17 Mar 2012 13:10:50 +0000 (13:10 +0000)]
fs_enet: Add MPC5125 FEC support and PHY interface selection
Add compatible string for MPC5125 FEC. The FEC on MPC5125 additionally
supports RMII PHY interface. Configure controller/PHY interface type
according to the optional phy-connection-type property in the ethernet
node. This property should be either "rmii" or "mii".
Signed-off-by: Vladimir Ermakov <vooon341@gmail.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 18 Mar 2012 02:40:48 +0000 (02:40 +0000)]
net: bpf_jit: fix BPF_S_LDX_B_MSH compilation
Matt Evans spotted that x86 bpf_jit was incorrectly handling negative
constant offsets in BPF_S_LDX_B_MSH instruction.
We need to abort JIT compilation like we do in common_load so that
filter uses the interpreter code and can call __load_pointer()
Reference: http://lists.openwall.net/netdev/2011/07/19/11
Thanks to Indan Zupancic to bring back this issue.
Reported-by: Matt Evans <matt@ozlabs.org>
Reported-by: Indan Zupancic <indan@nul.nu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Mon, 19 Mar 2012 11:12:41 +0000 (11:12 +0000)]
net: update the usage of CHECKSUM_UNNECESSARY
As suggested by Ben, this adds the clarification on the usage of
CHECKSUM_UNNECESSARY on the outgoing patch. Also add the usage
description of NETIF_F_FCOE_CRC and CHECKSUM_UNNECESSARY
for the kernel FCoE protocol driver.
This is a follow-up to the following:
http://patchwork.ozlabs.org/patch/147315/
Signed-off-by: Yi Zou <yi.zou@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: www.Open-FCoE.org <devel@open-fcoe.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Fri, 16 Mar 2012 23:08:12 +0000 (23:08 +0000)]
fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx
Fix a bug when using 'ethtool -K ethx tx off' to turn off tx ip checksum,
FCoE CRC offload should not be impacte. The skb_checksum_help() is needed
only if it's not FCoE traffic for ip checksum, regardless of ethtool toggling
the tx ip checksum on or off. Instead of using CHECKSUM_PARTIAL, we will
use CHECKSUM_UNNECESSARY as a proper indication to avoid sw ip checksum
on FCoE frames.
Ref. to original discussion thread:
http://patchwork.ozlabs.org/patch/146567/
CC: "James E.J. Bottomley" <JBottomley@parallels.com>
CC: Robert Love <robert.w.love@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yi Zou [Fri, 16 Mar 2012 23:08:11 +0000 (23:08 +0000)]
net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso
This is related to fixing the bug of dropping FCoE frames when disabling tx ip
checksum by 'ethtool -K ethx tx off'. The FCoE protocol stack driver would
use CHECKSUM_UNNECESSARY on tx path instead of CHECKSUM_PARTIAL (as indicated in
the 2/2 of this series). To do so, netif_needs_gso() has to be changed here to
not do gso for both CHECKSUM_PARTIAL and CHECKSUM_UNNECESSARY.
Ref. to original discussion thread:
http://patchwork.ozlabs.org/patch/146567/
Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Fri, 16 Mar 2012 23:07:34 +0000 (23:07 +0000)]
ixgbe: Fix issues with SR-IOV loopback when flow control is disabled
This patch allows us to avoid a Tx hang when SR-IOV is enabled. This hang
can be triggered by sending small packets at a rate that was triggering Rx
missed errors from the adapter while the internal Tx switch and at least
one VF are enabled.
This was all due to the fact that under heavy stress the Rx FIFO never
drained below the flow control high water mark. This resulted in the Tx
FIFO being head of line blocked due to the fact that it relies on the flow
control high water mark to determine when it is acceptable for the Tx to
place a packet in the Rx FIFO.
The resolution for this is to set the FCRTH value to the RXPBSIZE - 32 so
that even if the ring is almost completely full we can still place Tx
packets on the Rx ring and drop incoming Rx traffic if we do not have
sufficient space available in the Rx FIFO.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Mon, 19 Mar 2012 21:27:06 +0000 (17:27 -0400)]
net/hyperv: Fix the code handling tx busy
Instead of dropping the packet, we keep the skb buffer, and return
NETDEV_TX_BUSY to let upper layer retry send. This will not cause
endless loop, because the host is taking data away from ring buffer,
and we have called the stop_queue before returning NETDEV_TX_BUSY.
The stop_queue was called in the function netvsc_send() in file
netvsc.c, then it returns to rndis_filter_send(), which returns to
netvsc_start_xmit() in file netvsc_drv.c. So the NETDEV_TX_BUSY is
indeed returned AFTER queue is stopped.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Mar 2012 21:24:27 +0000 (17:24 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Jeff Kirsher [Sat, 18 Feb 2012 07:08:14 +0000 (07:08 +0000)]
ixgbe: fix namespace issues when FCoE/DCB is not enabled
Resolve namespace issues when FCoE or DCB is not enabled.
The issue is with certain configurations we end up with namespace
problems. A simple example:
ixgbe_main.c
- defines func A()
- uses func A()
ixgbe_fcoe.c
- uses func A()
ixgbe.h
- has prototype for func A()
For default (FCoE included) all is good. But when it isn't the namespace
checker complains about how func A() could be static.
To resolve this, created a ixgbe_lib file to contain functions used
by DCB/FCoE and their helper functions so that they are always in
namespace whether or not DCB/FCoE is enabled.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Joe Perches [Sun, 18 Mar 2012 17:37:59 +0000 (17:37 +0000)]
rtlwifi: Remove unused ETH_ADDR_LEN defines
Just neatening.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 18 Mar 2012 17:37:58 +0000 (17:37 +0000)]
igbvf: Use ETH_ALEN
Remove an unnecessary #define and use memcpy
instead of a loop to copy an ethernet address.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 18 Mar 2012 17:37:57 +0000 (17:37 +0000)]
atlx: Use ETH_ALEN
No need for yet another #define for this.
Convert NODE_ADDRESS_SIZE use to ETH_ALEN and remove #define.
Use memcpy instead of a loop to copy an address.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 18 Mar 2012 17:37:56 +0000 (17:37 +0000)]
if_vlan: Remove VLAN_ETH_ALEN define and the 1 use of it
Just use ETH_ALEN.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 18 Mar 2012 11:07:47 +0000 (11:07 +0000)]
tcp: reduce out_of_order memory use
With increasing receive window sizes, but speed of light not improved
that much, out of order queue can contain a huge number of skbs, waiting
to be moved to receive_queue when missing packets can fill the holes.
Some devices happen to use fat skbs (truesize of 4096 + sizeof(struct
sk_buff)) to store regular (MTU <= 1500) frames. This makes highly
probable sk_rmem_alloc hits sk_rcvbuf limit, which can be 4Mbytes in
many cases.
When limit is hit, tcp stack calls tcp_collapse_ofo_queue(), a true
latency killer and cpu cache blower.
Doing the coalescing attempt each time we add a frame in ofo queue
permits to keep memory use tight and in many cases avoid the
tcp_collapse() thing later.
Tested on various wireless setups (b43, ath9k, ...) known to use big skb
truesize, this patch removed the "packets collapsed in receive queue due
to low socket buffer" I had before.
This also reduced average memory used by tcp sockets.
With help from Neal Cardwell.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 18 Mar 2012 11:06:44 +0000 (11:06 +0000)]
tcp: introduce tcp_data_queue_ofo
Split tcp_data_queue() in two parts for better readability.
tcp_data_queue_ofo() is responsible for queueing incoming skb into out
of order queue.
Change code layout so that the skb_set_owner_r() is performed only if
skb is not dropped.
This is a preliminary patch before "reduce out_of_order memory use"
following patch.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 18 Mar 2012 10:33:45 +0000 (10:33 +0000)]
bnx2x: validate FW trace prior to its printing
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 18 Mar 2012 10:33:44 +0000 (10:33 +0000)]
bnx2x: consistent statistics for old FW
Previously applied patch making the bnx2x statistics consistent
did not apply to old FWs. This remedies it, extending the consistent
behaviour to all drivers.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Reported-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 18 Mar 2012 10:33:43 +0000 (10:33 +0000)]
bnx2x: changed iscsi/fcoe mac init and macros
This includes changes in macros to better distinguish between the two
protocols, and slightly changed the way their macs are set.
Notice this file contains string print lines with more than 80 characters,
as to not break prints.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 18 Mar 2012 10:33:42 +0000 (10:33 +0000)]
bnx2x: added TLV_NOT_FOUND flags to the dcb
The new error flags are supported by the bnx2x FW.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 18 Mar 2012 10:33:41 +0000 (10:33 +0000)]
bnx2x: changed initial dcb configuration
The changes were mostly made to enable back-to-back data flow with dcb.
Other changes were simply deemed as a better 'clean' initial configuration.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 18 Mar 2012 10:33:40 +0000 (10:33 +0000)]
bnx2x: removed dcb unused code
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 18 Mar 2012 10:33:39 +0000 (10:33 +0000)]
bnx2x: reduced sparse warnings
This patch reduces sparse warnings in the bnx2x code,
mostly by changing functions into static and changing
initialization of structures.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merav Sicron [Sun, 18 Mar 2012 10:33:38 +0000 (10:33 +0000)]
bnx2x: revised driver prints
We've revised driver prints, changing the mask of existing prints
to allow better control over the debug messages, added prints to
error scenarios, removed unnecessary prints and corrected some spelling.
Please note that this patch contains lines with over 80 characters,
as string messages were kept in a single line.
Signed-off-by: Merav Sicron <meravs@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 18 Mar 2012 10:33:37 +0000 (10:33 +0000)]
bnx2x: added 'likely' to fast-path skb existence
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Sun, 18 Mar 2012 06:23:41 +0000 (06:23 +0000)]
be2net: fix programming of VLAN tags for VF
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Sun, 18 Mar 2012 06:23:31 +0000 (06:23 +0000)]
be2net: Fix number of vlan slots in flex mode
In flex10 mode the number of vlan slots supported is halved.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Sun, 18 Mar 2012 06:23:21 +0000 (06:23 +0000)]
be2net: Program secondary UC MAC address into MAC filter
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ajit Khaparde [Sun, 18 Mar 2012 06:23:11 +0000 (06:23 +0000)]
be2net: enable WOL by default if h/w supports it
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Mar 2012 20:46:22 +0000 (16:46 -0400)]
Merge branch 'gianfar-bql' of git://git./linux/kernel/git/paulg/linux
Alexander Duyck [Sat, 11 Feb 2012 07:18:57 +0000 (07:18 +0000)]
ixgbe: Correct flag values set by ixgbe_fix_features
This patch replaces the variable name data with the variable name features
for ixgbe_fix_features and ixgbe_set_features. This helps to make some
issues more obvious such as the fact that we were disabling Rx VLAN tag
stripping when we should have been forcing it to be enabled when DCB is
enabled.
In addition there was deprecated code present that was disabling the LRO
flag if we had the itr value set too low. I have updated this logic so
that we will now allow the LRO flag to be set, but will not enable RSC
until the rx-usecs value is high enough to allow enough time for Rx packet
coalescing.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:53 +0000 (07:51 +0000)]
ixgbe: Add support for enabling UDP RSS via the ethtool rx-flow-hash command
This patch adds support for enabling or disabling UDP RSS via the
ethtool -N rx-flow-hash command.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:47 +0000 (07:51 +0000)]
ixgbe: Whitespace cleanups
This patch contains several fixes for formatting in regards to whitespace.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:42 +0000 (07:51 +0000)]
ixgbe: Two minor fixes for RSS and FDIR set queues functions
This change fixes two minor issues. The first was the fact that we were
setting the return value to false twice in the set_rss_queues function.
The second is the fact that we should have been using "min_t(int," instead
of "min((int)" in set_fdir_queues.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:37 +0000 (07:51 +0000)]
ixgbe: drop err_eeprom tag which is at same location as err_sw_init
The err_eeprom and err_sw_init tags both go to the same location. So
instead of maintaining two tags this patch combines them so we only use
err_sw_init.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:27 +0000 (07:51 +0000)]
ixgbe: Move poll routine in order to improve readability
This change relocates the ixgbe_poll routine so it is right next to the
interrupt routine that schedules and calls it.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:22 +0000 (07:51 +0000)]
ixgbe: cleanup logic for the service timer and VF hang detection
This change just cleans up some of the logic in the service_timer function
so that we can avoid unnecessary swapping of the ready value between true to
false and back to true.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:16 +0000 (07:51 +0000)]
ixgbe: Update layout of ixgbe_ring structure to improve cache performance
This change makes it so that only the 2nd cache line in the ring structure
should see frequent updates. The advantage to this is that it should
reduce the amount of cross CPU cache bouncing since only the 2nd cache line
will be changing between most network transactions.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:51:11 +0000 (07:51 +0000)]
ixgbe: Store Tx flags and protocol information to tx_buffer sooner
This change makes it so that we store the tx_flags and protocol information
to the tx_buffer_info structure sooner. This allows us to avoid unnecessary
read/write transactions since we are placing the data in the final location
earlier.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Mon, 19 Mar 2012 03:29:41 +0000 (23:29 -0400)]
Merge git://git./linux/kernel/git/davem/net
Linus Torvalds [Sun, 18 Mar 2012 23:15:34 +0000 (16:15 -0700)]
Linux 3.3
Paul Gortmaker [Sun, 18 Mar 2012 21:11:22 +0000 (17:11 -0400)]
gianfar: use netif_tx_queue_stopped instead of __netif_subqueue_stopped
The __netif_subqueue_stopped() just does the following:
struct netdev_queue *txq = netdev_get_tx_queue(dev, queue_index);
return netif_tx_queue_stopped(txq);
and since we already have the txq in scope, we can just call that
directly in this case.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Jason Baron [Fri, 16 Mar 2012 20:34:03 +0000 (16:34 -0400)]
Don't limit non-nested epoll paths
Commit
28d82dc1c4ed ("epoll: limit paths") that I did to limit the
number of possible wakeup paths in epoll is causing a few applications
to longer work (dovecot for one).
The original patch is really about limiting the amount of epoll nesting
(since epoll fds can be attached to other fds). Thus, we probably can
allow an unlimited number of paths of depth 1. My current patch limits
it at 1000. And enforce the limits on paths that have a greater depth.
This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578
Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 18 Mar 2012 02:22:24 +0000 (19:22 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking changes from David Miller:
"1) icmp6_dst_alloc() returns NULL instead of ERR_PTR() leading to
crashes, particularly during shutdown. Reported by Dave Jones and
fixed by Eric Dumazet.
2) hyperv and wimax/i2400m return NETDEV_TX_BUSY when they have
already freed the SKB, which causes crashes as to the caller this
means requeue the packet. Fixes from Eric Dumazet.
3) usbnet driver doesn't allocate the right amount of headroom on
fresh RX SKBs, fix from Eric Dumazet.
4) Fix regression in ip6_mc_find_dev_rcu(), as an RCU lookup it
abolutely should not take a reference to 'dev', this leads to
leaks. Fix from RonQing Li.
5) Fix netfilter ctnetlink race between delete and timeout expiration.
From Pablo Neira Ayuso.
6) Revert SFQ change which causes regressions, specifically queueing
to tail can lead to unavoidable flow starvation. From Eric
Dumazet.
7) Fix a memory leak and a crash on corrupt firmware files in bnx2x,
from Michal Schmidt."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netfilter: ctnetlink: fix race between delete and timeout expiration
ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
wimax/i2400m: fix erroneous NETDEV_TX_BUSY use
net/hyperv: fix erroneous NETDEV_TX_BUSY use
net/usbnet: reserve headroom on rx skbs
bnx2x: fix memory leak in bnx2x_init_firmware()
bnx2x: fix a crash on corrupt firmware file
sch_sfq: revert dont put new flow at the end of flows
ipv6: fix icmp6_dst_alloc()
Linus Torvalds [Sat, 17 Mar 2012 16:54:16 +0000 (09:54 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools, x86: Build perf on older user-space as well
perf tools: Use scnprintf where applicable
perf tools: Incorrect use of snprintf results in SEGV
David S. Miller [Sat, 17 Mar 2012 09:02:26 +0000 (02:02 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next
Pablo Neira Ayuso [Fri, 16 Mar 2012 02:00:34 +0000 (02:00 +0000)]
netfilter: ctnetlink: fix race between delete and timeout expiration
Kerin Millar reported hardlockups while running `conntrackd -c'
in a busy firewall. That system (with several processors) was
acting as backup in a primary-backup setup.
After several tries, I found a race condition between the deletion
operation of ctnetlink and timeout expiration. This patch fixes
this problem.
Tested-by: Kerin Millar <kerframil@gmail.com>
Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 8 Feb 2012 07:51:06 +0000 (07:51 +0000)]
ixgbe: always write DMA for single_mapped value with skb
This change makes it so that we always write the DMA address for the skb
itself on the same tx_buffer struct that the skb is written on. This way
we don't need the MAPPED_AS_PAGE flag and we always know it will be the
first DMA value that we will have to unmap.
In addition I have found an issue in which we were leaking a DMA mapping if
the value happened to be 0 which is possible on some platforms. In order
to resolve that I have updated the transmit path to use the length instead
of the DMA mapping in order to determine if a mapping is actually present.
One other tweak in this patch is that it only writes the olinfo information
on the first descriptor. As it turns out it isn't necessary to write it
for anything but the first descriptor so there is no need to carry it
forward.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Richard Cochran [Fri, 16 Mar 2012 22:39:29 +0000 (22:39 +0000)]
phc: Update author's email address.
This commit brings the author email address macros up to date for four
modules in the PTP Hardware Clock subsystem.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 8 Feb 2012 07:51:01 +0000 (07:51 +0000)]
ixgbe: Write gso_segs and bytcount to the ring sooner
This change makes it so that gso_segs and bytecount are written to the ring
sooner. This helps to simplify the logic for the two since segmentation
offloads can now update them within their own function.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:50:56 +0000 (07:50 +0000)]
ixgbe: Place skb on first buffer_info structure to avoid using stack space
Instead of keeping a local copy of the skb on the stack for as long as long
as we do it makes sense to instead just place it on the first tx_buffer
structure so that we can save space on the stack and avoid unnecessary
read/write operations copying the pointer out of the stack and onto the
ring later.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:50:51 +0000 (07:50 +0000)]
ixgbe: Use packets to track Tx completions instead of a seperate value
A separate value was added to track Tx completions in order to determine if
the Tx unit was hung. However we can do the same thing using the number of
packets completed without having to add another stat to the Tx ring.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:50:45 +0000 (07:50 +0000)]
ixgbe: Modify setup of descriptor flags to avoid conditional jumps
This change makes it more likely that the descriptor flags setup will use
cmov instructions instead of conditional jumps when setting up the flags.
The advantage to this is that the code should just flow a bit more
smoothly.
To do this it is necessary to set the TX_FLAGS_CSUM bit in tx_flags when
doing TSO so that we also do the checksum in addition to the segmentation
offload.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:50:40 +0000 (07:50 +0000)]
ixgbe: Make certain that all frames fit minimum size requirements
This change makes certain that any packet we attempt to transmit will meet
minimum size requirements for the hardware.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Wed, 8 Feb 2012 07:50:35 +0000 (07:50 +0000)]
ixgbe: cleanup logic in ixgbe_change_mtu
This change is meant to just cleanup the logic in ixgbe_change_mtu since we
are making it unnecessarily complex due to a workaround required for 82599
when SR-IOV is enabled.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Sat, 3 Mar 2012 02:35:52 +0000 (02:35 +0000)]
ixgbe: Replace standard receive path with a page based receive
This patch replaces the existing Rx hot-path in the ixgbe driver with a new
implementation that is based on performing a double buffered receive. The
ixgbe driver already had something similar in place for its' packet split
path, however in that case we were still receiving the header for the
packet into the sk_buff. The big change here is the entire receive path
will receive into pages only, and then pull the header out of the page and
copy it into the sk_buff data. There are several motivations behind this
approach.
First, this allows us to avoid several cache misses as we were taking a
set of cache misses for allocating the sk_buff and then another set for
receiving data into the sk_buff. We are able to avoid these misses on
receive now as we allocate the sk_buff when data is available.
Second we are able to see a considerable performance gain when an IOMMU is
enabled because we are no longer unmapping every buffer on receive.
Instead we can delay the unmap until we are unable to use the page, and
instead we can simply call sync_single_range on the half of the page that
contains new data.
Finally we are able to drop a considerable amount of code from the driver
as we no longer have to support 2 different receive modes, packet split and
one buffer. This allows us to optimize the Rx path further since less
branching is required.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ben Greear [Thu, 8 Mar 2012 08:28:41 +0000 (08:28 +0000)]
ixgbe: Support RX-ALL feature flag.
This allows the NIC to receive all frames available, including
those with bad FCS, ethernet control frames, and more.
Tested by sending frames with bad FCS.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ben Greear [Tue, 6 Mar 2012 09:42:04 +0000 (09:42 +0000)]
ixgbe: Support sending custom Ethernet FCS.
Including bad FCS, used generate frames with bad FCS
to test other system's handling of RX of bad packets.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>