Herbert Xu [Wed, 6 Jul 2005 20:55:21 +0000 (13:55 -0700)]
[CRYPTO] Remove unused iv field from context structure
The iv field in des_ctx/des3_ede_ctx/serpent_ctx has never been used.
This was noticed by Dag Arne Osvik.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Steinmetz [Wed, 6 Jul 2005 20:55:00 +0000 (13:55 -0700)]
[CRYPTO] Add x86_64 asm AES
Implementation:
===============
The encrypt/decrypt code is based on an x86 implementation I did a while
ago which I never published. This unpublished implementation does
include an assembler based key schedule and precomputed tables. For
simplicity and best acceptance, however, I took Gladman's in-kernel code
for table generation and key schedule for the kernel port of my
assembler code and modified this code to produce the key schedule as
required by my assembler implementation. File locations and Kconfig are
kept similar to the i586 AES assembler implementation.
It may seem a little bit strange to use 32 bit I/O and registers in the
assembler implementation but this gives the best code size. My
implementation takes one instruction more per round compared to
Gladman's x86 assembler but it doesn't require any stack for local
variables or saved registers and it is less serialized than Gladman's
code.
Note that all comparisons to Gladman's code were done after my code was
implemented. I did only use FIPS PUB 197 for the implementation so my
implementation is independent work.
If anybody has a better assembler solution for x86_64 I'll be pleased to
have my code replaced with the better solution.
Testing:
========
The implementation passes the in-kernel crypto testing module and I'm
running it without any problems on my laptop where it is mainly used for
dm-crypt.
Microbenchmark:
===============
The microbenchmark was done in userspace with similar compile flags as
used during kernel compile.
Encrypt/decrypt is about 35% faster than the generic C implementation.
As the generic C as well as my assembler implementation are both table
I don't really expect that there is much room for further
improvements though I'll be glad to be corrected here.
The key schedule is about 5% slower than the generic C implementation.
This is due to the fact that some more work has to be done in the key
schedule routine to fit the schedule to the assembler implementation.
Code Size:
==========
Encrypt and decrypt are together about 2.1 Kbytes smaller than the
generic C implementation which is important with regard to L1 cache
usage. The key schedule routine is about 100 bytes larger than the
generic C implementation.
Data Size:
==========
There's no difference in data size requirements between the assembler
implementation and the generic C implementation.
License:
========
Gladmans's code is dual BSD/GPL whereas my assembler code is GPLv2 only
(I'm not going to change the license for my code). So I had to change
the module license for the x86_64 aes module from 'Dual BSD/GPL' to
'GPL' to reflect the most restrictive license within the module.
Signed-off-by: Andreas Steinmetz <ast@domdv.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Wed, 6 Jul 2005 20:54:31 +0000 (13:54 -0700)]
[CRYPTO] Add null short circuit to crypto_free_tfm
As far as I'm aware there's a general concensus that functions that are
responsible for freeing resources should be able to cope with being passed
a NULL pointer. This makes sense as it removes the need for all callers to
check for NULL, thus elliminating the bugs that happen when some forget
(safer to just check centrally in the freeing function) and it also makes
for smaller code all over due to the lack of all those NULL checks.
This patch makes it safe to pass the crypto_free_tfm() function a NULL
pointer. Once this patch is applied we can start removing the NULL checks
from the callers.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:54:09 +0000 (13:54 -0700)]
[CRYPTO] Update IV correctly for Padlock CBC encryption
When the Padlock does CBC encryption, the memory pointed to by EAX is
not updated at all. Instead, it updates the value of EAX by pointing
it to the last block in the output. Therefore to maintain the correct
semantics we need to copy the IV.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:53:47 +0000 (13:53 -0700)]
[CRYPTO] Handle unaligned iv from encrypt_iv/decrypt_iv
Even though cit_iv is now always aligned, the user can still supply an
unaligned iv through crypto_cipher_encrypt_iv/crypto_cipher_decrypt_iv.
This patch will check the alignment of the user-supplied iv and copy
it if necessary.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:53:29 +0000 (13:53 -0700)]
[CRYPTO] Ensure cit_iv is aligned correctly
This patch ensures that cit_iv is aligned according to cra_alignmask
by allocating it as part of the tfm structure. As a side effect the
crypto layer will also guarantee that the tfm ctx area has enough space
to be aligned by cra_alignmask. This allows us to remove the extra
space reservation from the Padlock driver.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Wed, 6 Jul 2005 20:53:09 +0000 (13:53 -0700)]
[CRYPTO] Make crypto_alg_lookup static
This patch makes a needlessly global function static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:52:43 +0000 (13:52 -0700)]
[PADLOCK] Implement multi-block operations
By operating on multiple blocks at once, we expect to extract more
performance out of the VIA Padlock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:52:27 +0000 (13:52 -0700)]
[PADLOCK] Move fast path work into aes_set_key and upper layer
Most of the work done aes_padlock can be done in aes_set_key. This
means that we only have to do it once when the key changes rather
than every time we perform an encryption or decryption.
This patch also sets cra_alignmask to let the upper layer ensure
that the buffers fed to us are aligned correctly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:52:09 +0000 (13:52 -0700)]
[CRYPTO] Add alignmask for low-level cipher implementations
The VIA Padlock device requires the input and output buffers to
be aligned on 16-byte boundaries. This patch adds the alignmask
attribute for low-level cipher implementations to indicate their
alignment requirements.
The mid-level crypt() function will copy the input/output buffers
if they are not aligned correctly before they are passed to the
low-level implementation.
Strictly speaking, some of the software implementations require
the buffers to be aligned on 4-byte boundaries as they do 32-bit
loads. However, it is not clear whether it is better to copy
the buffers or pay the penalty for unaligned loads/stores.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:51:52 +0000 (13:51 -0700)]
[CRYPTO] Add support for low-level multi-block operations
This patch adds hooks for cipher algorithms to implement multi-block
ECB/CBC operations directly. This is expected to provide significant
performance boots to the VIA Padlock.
It could also be used for improving software implementations such as
AES where operating on multiple blocks at a time may enable certain
optimisations.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 6 Jul 2005 20:51:31 +0000 (13:51 -0700)]
[CRYPTO] Add plumbing for multi-block operations
The VIA Padlock device is able to perform much better when multiple
blocks are fed to it at once. As this device offers an exceptional
throughput rate it is worthwhile to optimise the infrastructure
specifically for it.
We shift the existing page-sized fast path down to the CBC/ECB functions.
We can then replace the CBC/ECB functions with functions provided by the
underlying algorithm that performs the multi-block operations.
As a side-effect this improves the performance of large cipher operations
for all existing algorithm implementations. I've measured the gain to be
around 5% for 3DES and 15% for AES.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Wed, 6 Jul 2005 20:51:00 +0000 (13:51 -0700)]
[CRYPTO] Don't check for NULL before kfree()
Checking a pointer for NULL before calling kfree() on it is redundant.
This patch removes such checks from crypto/
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg KH [Wed, 6 Jul 2005 15:51:03 +0000 (08:51 -0700)]
[PATCH] Fix bt87x.c build problem
Missing forward declaration
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Greg KH [Wed, 6 Jul 2005 16:09:38 +0000 (09:09 -0700)]
[PATCH] PCI: fix !CONFIG_HOTPLUG pci build problem
Here's a patch to fix the build issue when CONFIG_HOTPLUG is not enabled
in 2.6.13-rc2.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Wed, 6 Jul 2005 03:46:33 +0000 (20:46 -0700)]
Linux v2.6.13-rc3
Linus Torvalds [Wed, 6 Jul 2005 03:37:09 +0000 (20:37 -0700)]
Merge /pub/scm/linux/kernel/git/davem/sparc-2.6
David S. Miller [Wed, 6 Jul 2005 02:45:24 +0000 (19:45 -0700)]
[SPARC64]: Fix UltraSPARC-III fallout from membar changes.
The membar changes made the size of __cheetah_flush_tlb_pending
grow by one instruction, but the boot-time code patching was
not updated to match.
Signed-off-by: David S. Miller <davem@davemloft.net>
Rusty Lynch [Wed, 6 Jul 2005 01:54:50 +0000 (18:54 -0700)]
[PATCH] kprobes: fix namespace problem and sparc64 build
The following renames arch_init, a kprobes function for performing any
architecture specific initialization, to arch_init_kprobes in order to
cleanup the namespace.
Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes
build from the last return probe patch.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eugene Surovegin [Wed, 6 Jul 2005 01:54:45 +0000 (18:54 -0700)]
[PATCH] ppc32: explicitly disable 440GP IRQ compatibility mode in 440GX setup
Add explicit disabling of 440GP IRQ compatibility mode when configuring
440GX interrupt controller. This helps when board firmware for some reason
uses this compatibility mode and leaves it enabled. It breaks 440GX
interrupt code because it assumes native 440GX IRQ mode. People seems to
be continuously bitten by this.
Signed-off-by: Eugene Surovegin <ebs@ebshome.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
john stultz [Wed, 6 Jul 2005 01:54:44 +0000 (18:54 -0700)]
[PATCH] ppc32: stop misusing NTP's time_offset value
As part of my timeofday rework, I've been looking at the NTP code and I
noticed that the PPC architecture is apparently misusing the NTP's
time_offset (it is a terrible name!) value as some form of timezone offset.
This could cause problems when time_offset changed by the NTP code. This
patch changes the PPC code so it uses a more clear local variable:
timezone_offset.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Tom Rini <trini@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrei Konovalov [Wed, 6 Jul 2005 01:54:43 +0000 (18:54 -0700)]
[PATCH] ppc32: add Freescale MPC885ADS board support
This patch adds the Freescale MPC86xADS board support. The supported
devices are SMC UART and 10Mbit ethernet on SCC1.
The manual for the board says that it "is compatible with the MPC8xxFADS
for software point of view". That's why this patch extends FADS instead of
introducing a new platform.
FEC is not supported as the "combined FCC/FEC ethernet driver" driver by
Pantelis Antoniou should replace the current FEC driver.
Signed-off-by: Gennadiy Kurtsman <gkurtsman@ru.mvista.com>
Signed-off-by: Andrei Konovalov <akonovalov@ru.mvista.com>
Acked-by: Tom Rini <trini@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Wed, 6 Jul 2005 01:41:58 +0000 (18:41 -0700)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6
Robert Olsson [Tue, 5 Jul 2005 23:38:26 +0000 (16:38 -0700)]
[IPV4]: Add LC-Trie implementation notes
Signed-off-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:43:58 +0000 (15:43 -0700)]
[TCP]: Never TSO defer under periods of congestion.
Congestion window recover after loss depends upon the fact
that if we have a full MSS sized frame at the head of the
send queue, we will send it. TSO deferral can defeat the
ACK clocking necessary to exit cleanly from recovery.
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 5 Jul 2005 22:29:16 +0000 (15:29 -0700)]
[PKT_SCHED]: Blackhole queueing discipline
Useful in combination with classful qdiscs to drop or
temporary disable certain flows, e.g. one could block
specific ds flows with dsmark.
Unlike the noop qdisc it can be controlled by the user and
statistic accounting is done.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:24:38 +0000 (15:24 -0700)]
[TCP]: Move to new TSO segmenting scheme.
Make TSO segment transmit size decisions at send time not earlier.
The basic scheme is that we try to build as large a TSO frame as
possible when pulling in the user data, but the size of the TSO frame
output to the card is determined at transmit time.
This is guided by tp->xmit_size_goal. It is always set to a multiple
of MSS and tells sendmsg/sendpage how large an SKB to try and build.
Later, tcp_write_xmit() and tcp_push_one() chop up the packet if
necessary and conditions warrant. These routines can also decide to
"defer" in order to wait for more ACKs to arrive and thus allow larger
TSO frames to be emitted.
A general observation is that TSO elongates the pipe, thus requiring a
larger congestion window and larger buffering especially at the sender
side. Therefore, it is important that applications 1) get a large
enough socket send buffer (this is accomplished by our dynamic send
buffer expansion code) 2) do large enough writes.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:21:10 +0000 (15:21 -0700)]
[TCP]: Break out send buffer expansion test.
This makes it easier to understand, and allows easier
tweaking of the heuristic later on.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:20:55 +0000 (15:20 -0700)]
[TCP]: Do not call tcp_tso_acked() if no work to do.
In tcp_clean_rtx_queue(), if the TSO packet is not even partially
acked, do not waste time calling tcp_tso_acked().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:20:41 +0000 (15:20 -0700)]
[TCP]: Kill bogus comment above tcp_tso_acked().
Everything stated there is out of data. tcp_trim_skb()
does adjust the available socket send buffer space and
skb->truesize now.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:20:27 +0000 (15:20 -0700)]
[TCP]: Fix send-side cpu utiliziation regression.
Only put user data purely to pages when doing TSO.
The extra page allocations cause two problems:
1) Add the overhead of the page allocations themselves.
2) Make us do small user copies when we get to the end
of the TCP socket cache page.
It is still beneficial to purely use pages for TSO,
so we will do it for that case.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:20:09 +0000 (15:20 -0700)]
[TCP]: Eliminate redundant computations in tcp_write_xmit().
tcp_snd_test() is run for every packet output by a single
call to tcp_write_xmit(), but this is not necessary.
For one, the congestion window space needs to only be
calculated one time, then used throughout the duration
of the loop.
This cleanup also makes experimenting with different TSO
packetization schemes much easier.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:19:54 +0000 (15:19 -0700)]
[TCP]: Break out tcp_snd_test() into it's constituent parts.
tcp_snd_test() does several different things, use inline
functions to express this more clearly.
1) It initializes the TSO count of SKB, if necessary.
2) It performs the Nagle test.
3) It makes sure the congestion window is adhered to.
4) It makes sure SKB fits into the send window.
This cleanup also sets things up so that things like the
available packets in the congestion window does not need
to be calculated multiple times by packet sending loops
such as tcp_write_xmit().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:19:38 +0000 (15:19 -0700)]
[TCP]: Fix __tcp_push_pending_frames() 'nonagle' handling.
'nonagle' should be passed to the tcp_snd_test() function
as 'TCP_NAGLE_PUSH' if we are checking an SKB not at the
tail of the write_queue. This is because Nagle does not
apply to such frames since we cannot possibly tack more
data onto them.
However, while doing this __tcp_push_pending_frames() makes
all of the packets in the write_queue use this modified
'nonagle' value.
Fix the bug and simplify this function by just calling
tcp_write_xmit() directly if sk_send_head is non-NULL.
As a result, we can now make tcp_data_snd_check() just call
tcp_push_pending_frames() instead of the specialized
__tcp_data_snd_check().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:19:23 +0000 (15:19 -0700)]
[TCP]: Fix redundant calculations of tcp_current_mss()
tcp_write_xmit() uses tcp_current_mss(), but some of it's callers,
namely __tcp_push_pending_frames(), already has this value available
already.
While we're here, fix the "cur_mss" argument to be "unsigned int"
instead of plain "unsigned".
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:19:06 +0000 (15:19 -0700)]
[TCP]: tcp_write_xmit() tabbing cleanup
Put the main basic block of work at the top-level of
tabbing, and mark the TCP_CLOSE test with unlikely().
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:18:51 +0000 (15:18 -0700)]
[TCP]: Kill extra cwnd validate in __tcp_push_pending_frames().
The tcp_cwnd_validate() function should only be invoked
if we actually send some frames, yet __tcp_push_pending_frames()
will always invoke it. tcp_write_xmit() does the call for us,
so the call here can simply be removed.
Also, tcp_write_xmit() can be marked static.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:18:34 +0000 (15:18 -0700)]
[TCP]: Add missing skb_header_release() call to tcp_fragment().
When we add any new packet to the TCP socket write queue,
we must call skb_header_release() on it in order for the
TSO sharing checks in the drivers to work.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:18:18 +0000 (15:18 -0700)]
[TCP]: Move __tcp_data_snd_check into tcp_output.c
It reimplements portions of tcp_snd_check(), so it
we move it to tcp_output.c we can consolidate it's
logic much easier in a later change.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:18:03 +0000 (15:18 -0700)]
[TCP]: Move send test logic out of net/tcp.h
This just moves the code into tcp_output.c, no code logic changes are
made by this patch.
Using this as a baseline, we can begin to untangle the mess of
comparisons for the Nagle test et al. We will also be able to reduce
all of the redundant computation that occurs when outputting data
packets.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:17:45 +0000 (15:17 -0700)]
[TCP]: Fix quick-ack decrementing with TSO.
On each packet output, we call tcp_dec_quickack_mode()
if the ACK flag is set. It drops tp->ack.quick until
it hits zero, at which time we deflate the ATO value.
When doing TSO, we are emitting multiple packets with
ACK set, so we should decrement tp->ack.quick that many
segments.
Note that, unlike this case, tcp_enter_cwr() should not
take the tcp_skb_pcount(skb) into consideration. That
function, one time, readjusts tp->snd_cwnd and moves
into TCP_CA_CWR state.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 22:17:25 +0000 (15:17 -0700)]
[TCP]: Simplify SKB data portion allocation with NETIF_F_SG.
The ideal and most optimal layout for an SKB when doing
scatter-gather is to put all the headers at skb->data, and
all the user data in the page array.
This makes SKB splitting and combining extremely simple,
especially before a packet goes onto the wire the first
time.
So, when sk_stream_alloc_pskb() is given a zero size, make
sure there is no skb_tailroom(). This is achieved by applying
SKB_DATA_ALIGN() to the header length used here.
Next, make select_size() in TCP output segmentation use a
length of zero when NETIF_F_SG is true on the outgoing
interface.
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Tue, 5 Jul 2005 22:12:04 +0000 (15:12 -0700)]
[NET]: Remove __ARGS from include/net/slhc_vj.h
I suspect "#define __ARGS(x) ()" was deprecated before I was born.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Chau [Tue, 5 Jul 2005 22:11:06 +0000 (15:11 -0700)]
[NET]: improve readability of dev_set_promiscuity() in net/core/dev.c
A trivial patch to improve the readability of dev_set_promiscuity()
in net/core/dev.c. New code does exactly the same thing as original
code.
Signed-off-by: David Chau <ddcc@mit.edu>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Hellwig [Tue, 5 Jul 2005 22:03:46 +0000 (15:03 -0700)]
[SHAPER]: Switch to spinlocks.
Dave, you were right and the sleeping locks in shaper were
broken. Markus Kanet noticed this and also tested the patch below that
switches locking to spinlocks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Olsson [Tue, 5 Jul 2005 22:02:40 +0000 (15:02 -0700)]
[IPV4]: More broken memory allocation fixes for fib_trie
Below a patch to preallocate memory when doing resize of trie (inflate halve)
If preallocations fails it just skips the resize of this tnode for this time.
The oops we got when killing bgpd (with full routing) is now gone.
Patrick memory patch is also used.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 5 Jul 2005 22:01:25 +0000 (15:01 -0700)]
[DECNET]: Fix memset overflow on 64bit archs while dumping decnet routing rules
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 5 Jul 2005 22:00:32 +0000 (15:00 -0700)]
[IPV4]: Bug fix in rt_check_expire()
- rt_check_expire() fixes (an overflow occured if size of the hash
was >= 65536)
reminder of the bugfix:
The rt_check_expire() has a serious problem on machines with large
route caches, and a standard HZ value of 1000.
With default values, ie ip_rt_gc_interval = 60*HZ = 60000 ;
the loop count :
for (t = ip_rt_gc_interval << rt_hash_log; t >= 0;
overflows (t is a 31 bit value) as soon rt_hash_log is >= 16 (65536
slots in route cache hash table).
In this case, rt_check_expire() does nothing at all
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 5 Jul 2005 21:58:19 +0000 (14:58 -0700)]
[IPV4]: Use the fancy alloc_large_system_hash() function for route hash table
- rt hash table allocated using alloc_large_system_hash() function.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 5 Jul 2005 21:55:24 +0000 (14:55 -0700)]
[NET]: Hashed spinlocks in net/ipv4/route.c
- Locking abstraction
- Spinlocks moved out of rt hash table : Less memory (50%) used by rt
hash table. it's a win even on UP.
- Sizing of spinlocks table depends on NR_CPUS
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 5 Jul 2005 21:44:55 +0000 (14:44 -0700)]
[IPV4]: Handle large allocations in fib_trie
Inflating a node a couple of times makes it exceed the 128k kmalloc limit.
Use __get_free_pages for allocations > PAGE_SIZE, as in fib_hash.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 21:43:19 +0000 (14:43 -0700)]
[TG3]: Update driver version and reldate.
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 5 Jul 2005 21:42:33 +0000 (14:42 -0700)]
[TG3]: support for ethtool -C
Add support for ethtool -C with verification of user parameters.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Tue, 5 Jul 2005 21:41:20 +0000 (14:41 -0700)]
[IPV6]: Makes IPv6 rcv registration happen last during initialisation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Tue, 5 Jul 2005 21:40:10 +0000 (14:40 -0700)]
[IPV4]: Fix crash in ip_rcv while booting related to netconsole
Makes IPv4 ip_rcv registration happen last in af_inet.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 5 Jul 2005 21:24:35 +0000 (14:24 -0700)]
[SKGE]: Fix build on big-endian
Missing PCI_REV_DESC define.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 5 Jul 2005 21:17:40 +0000 (14:17 -0700)]
Merge /pub/scm/linux/kernel/git/davem/sparc-2.6
Thomas Graf [Tue, 5 Jul 2005 21:15:53 +0000 (14:15 -0700)]
[PKT_SCHED]: Report rate estimator configuration errors during qdisc allocation
Current behaviour is to not report an error if a rate
estimator is created together with a qdisc and the
configuration of the rate estimator is bogus. This leads
to unexpected behaviour because the user is not notified.
New behaviour is to report the error and let the whole
qdisc creation operation fail so the user is able to fix
his mistake.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 5 Jul 2005 21:15:09 +0000 (14:15 -0700)]
[PKT_SCHED]: Cleanup qdisc creation and alignment macros
Adds qdisc_alloc() to share code between qdisc_create()
and qdisc_create_dflt(). Hides the qdisc alignment behind
macros and makes use of them.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 5 Jul 2005 21:14:30 +0000 (14:14 -0700)]
[PKT_SCHED]: Move sch_generic.c prototypes to correct header file
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 5 Jul 2005 21:13:41 +0000 (14:13 -0700)]
[NET]: Reduce size of sk_buff by 4 bytes
Reduce local_df to a bit field and ip_summed to a 2 bits
field thus saving 13 bits. Move bit fields, packet type,
and protocol into the spare area between the priority
and the destructor. Saves 4 bytes on both, 32bit and
64bit architectures.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 5 Jul 2005 21:12:44 +0000 (14:12 -0700)]
[NET]: Remove unused security member in sk_buff
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 5 Jul 2005 21:10:40 +0000 (14:10 -0700)]
[NET]: net/core/filter.c: make len cover the entire packet
As suggested by Herbert Xu:
Since we don't require anything to be in the linear packet range
anymore make len cover the entire packet.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 5 Jul 2005 21:10:21 +0000 (14:10 -0700)]
[NET]: Consolidate common code in net/core/filter.c
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 5 Jul 2005 21:08:57 +0000 (14:08 -0700)]
[NET]: Remove redundant code in net/core/filter.c
skb_header_pointer handles linear and non-linear data, no need to handle
linear data again.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 5 Jul 2005 21:08:10 +0000 (14:08 -0700)]
[NET]: Fix signedness issues in net/core/filter.c
This is the code to load packet data into a register:
k = fentry->k;
if (k < 0) {
...
} else {
u32 _tmp, *p;
p = skb_header_pointer(skb, k, 4, &_tmp);
if (p != NULL) {
A = ntohl(*p);
continue;
}
}
skb_header_pointer checks if the requested data is within the
linear area:
int hlen = skb_headlen(skb);
if (offset + len <= hlen)
return skb->data + offset;
When offset is within [INT_MAX-len+1..INT_MAX] the addition will
result in a negative number which is <= hlen.
I couldn't trigger a crash on my AMD64 with 2GB of memory, but a
coworker tried on his x86 machine and it crashed immediately.
This patch fixes the check in skb_header_pointer to handle large
positive offsets similar to skb_copy_bits. Invalid data can still
be accessed using negative offsets (also similar to skb_copy_bits),
anyone using negative offsets needs to verify them himself.
Thanks to Thomas Vögtle <thomas.voegtle@coreworks.de> for verifying the
problem by crashing his machine and providing me with an Oops.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 5 Jul 2005 18:35:58 +0000 (11:35 -0700)]
Merge /pub/scm/linux/kernel/git/gregkh/pci-2.6
David S. Miller [Mon, 4 Jul 2005 22:58:19 +0000 (15:58 -0700)]
[SPARC64]: Fix IRQ retry interval timer value on sparc64 PCI controllers.
Use '5' instead of 'infinity'.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 4 Jul 2005 21:53:33 +0000 (14:53 -0700)]
[SPARC64]: Small Schizo PCI controller programming tweaks.
Use macro instead of magic value for Tomatillo discard-
timeout interrupt enable register bit.
Leave OBP programming PTO value unless Tomatillo and
version >= 0x2.
If no-bus-parking property is present, explicitly clear
PCICTRL_PARK bit.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 4 Jul 2005 20:26:04 +0000 (13:26 -0700)]
[SPARC64]: Do proper DMA IRQ syncing on Tomatillo
This was the main impetus behind adding the PCI IRQ shim.
In order to properly order DMA writes wrt. interrupts, you have to
write to a PCI controller register, then poll for that bit clearing.
There is one bit for each interrupt source, and setting this register
bit tells Tomatillo to drain all pending DMA from that device.
Furthermore, Tomatillo's with revision less than 4 require us to do a
block store due to some memory transaction ordering issues it has on
JBUS.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 4 Jul 2005 20:24:38 +0000 (13:24 -0700)]
[SPARC64]: Add support for IRQ pre-handlers.
This allows a PCI controller to shim into IRQ delivery
so that DMA queues can be drained, if necessary.
If some bus specific code needs to run before an IRQ
handler is invoked, the bus driver simply needs to setup
the function pointer in bucket->irq_info->pre_handler and
the two args bucket->irq_info->pre_handler_arg[12].
The Schizo PCI driver is converted over to use a pre-handler
for the DMA write-sync processing it needs when a device
is behind a PCI->PCI bus deeper than the top-level APB
bridges.
While we're here, clean up all of the action allocation
and handling. Now, we allocate the irqaction as part of
the bucket->irq_info area. There is an array of 4 irqaction
(for PCI irq sharing) and a bitmask saying which entries
are active.
The bucket->irq_info is allocated at build_irq() time, not
at request_irq() time. This simplifies request_irq() and
free_irq() tremendously.
The SMP dynamic IRQ retargetting code got removed in this
change too. It was disabled for a few months now, and we
can resurrect it in the future if we want.
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Hellwig [Mon, 4 Jul 2005 20:24:14 +0000 (13:24 -0700)]
[SPARC]: bpp: remove sleep_on usage
Use schedule_timeout() instead of sleep_on + timer. Totally untested
due to lack of hardware, but the changes are rather trivial.
Signed-off-by: David S. Miller <davem@davemloft.net>
Raphael Assenat [Mon, 4 Jul 2005 20:23:45 +0000 (13:23 -0700)]
[SPARC64/COMPAT]: Add some compat ioctl for ppdev
The following patch adds some ioctls to include/linux/compat_ioctl.h
to allow using ppdev from the 32 bit user space on sparc64.
This patch also adds the PPDEV option in the sparc64 menu, near Parallel
printer support in the 'General machine setup' submenu.
All those ioctls seem to be compatible, since (correct me if I'm wrong)
they dont use the 'long' type. See include/linux/ppdev.h.
The application I used to test the new ioctls only used the following:
PPEXCL
PPCLAIM
PPNEGOT
PPGETMODES
PPRCONTROL
PPWCONTROL
PPDATADIR
PPWDATA
PPRDATA
But I beleive that the other ioctls will work fine.
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Mon, 4 Jul 2005 12:02:46 +0000 (13:02 +0100)]
[PATCH] ARM: Fix new-ABI layout of struct stat64
Add __attribute__((packed)) to ensure that the stat64 structure is
correctly laid out no matter which ABI the kernel is compiled for.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 4 Jul 2005 09:44:34 +0000 (10:44 +0100)]
[PATCH] ARM: Fix non-standard PXA io_pg_offst initialisers
These didn't match my sed expression correctly, fix them up manually.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 4 Jul 2005 09:43:36 +0000 (10:43 +0100)]
[PATCH] ARM: Change 'param_offset' to 'boot_params'
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sun, 3 Jul 2005 21:39:33 +0000 (14:39 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Sun, 3 Jul 2005 21:37:09 +0000 (14:37 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-serial
Russell King [Sun, 3 Jul 2005 20:05:45 +0000 (21:05 +0100)]
[PATCH] Serial: Fix console port spinlock initialisation
Initialise the spinlock for port being used by the console early, but
don't re-initialise it again later.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Catalin Marinas [Sun, 3 Jul 2005 16:53:25 +0000 (17:53 +0100)]
[PATCH] ARM: 2784/1: Fix the block cache flush operation range
Patch from Catalin Marinas
The range for the ARMv6 block cache operations is inclusive but the
kernel doesn't re-calculate the end address, causing a page fault when
used (this only happens with support for cache aliasing, otherwise the
blk_flush_kern_dcache_page() is not called). This patch subtracts
L1_CACHE_BYTES from the end address.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Ben Dooks [Sun, 3 Jul 2005 16:44:40 +0000 (17:44 +0100)]
[PATCH] ARM: 2785/1: S3C24XX - serial calls request_irq() with IRQs disabled
Patch from Ben Dooks
The request_irq() function is called by s3c24xx uart driver with
the local IRQs disabled. The request_irq() function can allocate
memory via kmalloc(), and this may sleep causing a warning about
sleeping in an invalid context.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sun, 3 Jul 2005 16:38:58 +0000 (17:38 +0100)]
[PATCH] ARM: Remove machine description macros
Remove the pointless machine description macros, favouring C99
initialisers instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adrian Bunk [Sun, 3 Jul 2005 15:44:10 +0000 (17:44 +0200)]
[PATCH] drivers/ide/Makefile: kill dead CONFIG_BLK_DEV_IDE_TCQ entry
This patch kills the dead CONFIG_BLK_DEV_IDE_TCQ entry.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Rob Punkunus [Sun, 3 Jul 2005 15:37:18 +0000 (17:37 +0200)]
[PATCH] amd74xx: support MCP55 device IDs
From: Rob Punkunus <rpunkunus@nvidia.com>
Rob Punkunus recently submitted a patch to enable support for MCP51/MCP55 in
the amd74xx driver. This patch was whitespace-corrupted and didn't apply to
2.6.12 since MCP51 support was merged in the 2.6.12-rc series.
Gentoo would like to support this hardware for our upcoming release media, so
I fixed the patch, and here it is :)
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Denis Vlasenko [Sun, 3 Jul 2005 15:09:13 +0000 (17:09 +0200)]
[PATCH] ide: fix line break in ide messages
From: Denis Vlasenko <vda@ilport.com.ua>
* printk("\n") is misplaced, resulting in stray empty line in kernel log
* cleanups nerby: some back-to-back printks are combined, etc
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:42:18 +0000 (16:42 +0200)]
[PATCH] ide: hotplug mark __devinit via82cxxx.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:40:31 +0000 (16:40 +0200)]
[PATCH] ide: hotplug mark __devinit triflex.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:38:51 +0000 (16:38 +0200)]
[PATCH] ide: hotplug mark __devinit slc90e66.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:36:56 +0000 (16:36 +0200)]
[PATCH] ide: hotplug mark __devinit sl82c105.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:35:07 +0000 (16:35 +0200)]
[PATCH] ide: hotplug mark __devinit sc1200.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:33:16 +0000 (16:33 +0200)]
[PATCH] ide: hotplug mark __devinit opti621.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:31:04 +0000 (16:31 +0200)]
[PATCH] ide: hotplug mark __devinit ns87415.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:28:44 +0000 (16:28 +0200)]
[PATCH] ide: hotplug mark __devinit it8172.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:25:46 +0000 (16:25 +0200)]
[PATCH] ide: hotplug mark __devinit cy82c693.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:23:08 +0000 (16:23 +0200)]
[PATCH] ide: hotplug mark __devinit cs5530.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:15:41 +0000 (16:15 +0200)]
[PATCH] ide: hotplug mark __devinit amd74xx.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Herbert Xu [Sun, 3 Jul 2005 14:06:13 +0000 (16:06 +0200)]
[PATCH] ide: hotplug mark __devinit alim15x3.c
From: Herbert Xu <herbert@gondor.apana.org.au>
mark the __init section __devinit.
Splitted up from the Debian kernel patch.
see the thread about the pci hotplug crash on a stratus box.
http://marc.theaimsgroup.com/?l=linux-kernel&m=
111930108613386&w=2
Signed-off-by: maximilian attems <janitor@sternwelten.at>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Linus Torvalds [Sat, 2 Jul 2005 17:39:09 +0000 (10:39 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-mmc
Linus Torvalds [Sat, 2 Jul 2005 17:37:50 +0000 (10:37 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Sat, 2 Jul 2005 17:35:33 +0000 (10:35 -0700)]
If ACPI doesn't find an irq listed, don't accept 0 as a valid PCI irq.
That zero just means that nothing else found any irq information either.