Jan Engelhardt [Wed, 8 Oct 2008 09:35:19 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (5/6)
This patch does this for target extensions' checkentry functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:19 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (4/6)
This patch does this for target extensions' target functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:19 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (3/6)
This patch does this for match extensions' destroy functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:18 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (2/6)
This patch does this for match extensions' checkentry functions.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:18 +0000 (11:35 +0200)]
netfilter: xtables: move extension arguments into compound structure (1/6)
The function signatures for Xtables extensions have grown over time.
It involves a lot of typing/replication, and also a bit of stack space
even if they are not used. Realize an NFWS2008 idea and pack them into
structs. The skb remains outside of the struct so gcc can continue to
apply its optimizations.
This patch does this for match extensions' match functions.
A few ambiguities have also been addressed. The "offset" parameter for
example has been renamed to "fragoff" (there are so many different
offsets already) and "protoff" to "thoff" (there is more than just one
protocol here, so clarify).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:18 +0000 (11:35 +0200)]
netfilter: xtables: use "if" blocks in Kconfig
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: xtables: sort extensions alphabetically in Kconfig
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: ebtables: make BRIDGE_NF_EBTABLES a menuconfig option
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: ip6tables: fix Kconfig entry dependency for ip6t_LOG
ip6t_LOG does certainly not depend on the filter table.
(Also, move it so that menuconfig still displays it correctly.)
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: ip6tables: fix name of hopbyhop in Kconfig
The module is called hbh, not hopbyhop.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:17 +0000 (11:35 +0200)]
netfilter: xtables: do centralized checkentry call (1/2)
It used to be that {ip,ip6,etc}_tables called extension->checkentry
themselves, but this can be moved into the xtables core.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: ebtables: fix one wrong return value
Usually -EINVAL is used when checkentry fails (see *_tables).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: remove redundant casts from Ebtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: remove unused Ebtables functions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:16 +0000 (11:35 +0200)]
netfilter: implement hotdrop for Ebtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: ebtables: use generic table checking
Ebtables ORs (1 << NF_BR_NUMHOOKS) into the hook mask to indicate that
the extension was called from a base chain. So this also needs to be
present in the extensions' ->hooks.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: x_tables: output bad hook mask in hexadecimal
It is a mask, and masks are most useful in hex.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: move Ebtables to use Xtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:15 +0000 (11:35 +0200)]
netfilter: change Ebtables function signatures to match Xtables's
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:14 +0000 (11:35 +0200)]
netfilter: ebt_among: obtain match size through different means
The function signatures will be changed to match those of Xtables, and
the datalen argument will be gone. ebt_among unfortunately relies on
it, so we need to obtain it somehow.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:14 +0000 (11:35 +0200)]
netfilter: add dummy members to Ebtables code to ease transition to Xtables
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: Change return types of targets/watchers for Ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: change return types of match functions for ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: change return types of check functions for Ebtables extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:13 +0000 (11:35 +0200)]
netfilter: ebtables: do centralized size checking
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: Add documentation for tproxy
Add basic usage instructions to Documentation/networking.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: iptables TPROXY target
The TPROXY target implements redirection of non-local TCP/UDP traffic to local
sockets. Additionally, it's possible to manipulate the packet mark if and only
if a socket has been found. (We need this because we cannot use multiple
targets in the same iptables rule.)
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: iptables socket match
Add iptables 'socket' match, which matches packets for which a TCP/UDP
socket lookup succeeds.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: iptables tproxy core
The iptables tproxy core is a module that contains the common routines used by
various tproxy related modules (TPROXY target and socket match)
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
KOVACS Krisztian [Wed, 8 Oct 2008 09:35:12 +0000 (11:35 +0200)]
netfilter: split netfilter IPv4 defragmentation into a separate module
Netfilter connection tracking requires all IPv4 packets to be defragmented.
Both the socket match and the TPROXY target depend on this functionality, so
this patch separates the Netfilter IPv4 defrag hooks into a separate module.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: enable netfilter in netns
From kernel perspective, allow entrance in nf_hook_slow().
Stuff which uses nf_register_hook/nf_register_hooks, but otherwise not netns-ready:
DECnet netfilter
ipt_CLUSTERIP
nf_nat_standalone.c together with XFRM (?)
IPVS
several individual match modules (like hashlimit)
ctnetlink
NOTRACK
all sorts of queueing and reporting to userspace
L3 and L4 protocol sysctls, bridge sysctls
probably something else
Anyway critical mass has been achieved, there is no reason to hide netfilter any longer.
From userspace perspective, allow to manipulate all sorts of
iptables/ip6tables/arptables rules.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: netns nat: PPTP NAT in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: fixup DNAT in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:11 +0000 (11:35 +0200)]
netfilter: netns nat: per-netns bysource hash
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nat: per-netns NAT table
Same story as with iptable_filter, iptables_raw tables.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nat: fix ipt_MASQUERADE in netns
First, allow entry in notifier hook.
Second, start conntrack cleanup in netns to which netdevice belongs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: PPTP conntracking in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:10 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: GRE conntracking in netns
* make keymap list per-netns
* per-netns keymal lock (not strictly necessary)
* flush keymap at netns stop and module unload.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: H323 conntracking in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: SIP conntracking in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: final netns tweaks
Add init_net checks to not remove kmem_caches twice and so on.
Refactor functions to split code which should be executed only for
init_net into one place.
ip_ct_attach and ip_ct_destroy assignments remain separate, because
they're separate stages in setup and teardown.
NOTE: NOTRACK code is in for-every-net part. It will be made per-netns
after we decidce how to do it correctly.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:09 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns conntrack accounting
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_log_invalid sysctl
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_checksum sysctl
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_count sysctl
Note, sysctl table is always duplicated, this is simpler and less
special-cased.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:08 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/stat/nf_conntrack, /proc/net/stat/ip_conntrack
Show correct conntrack count, while I'm at it.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns statistics
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns event cache
Heh, last minute proof-reading of this patch made me think,
that this is actually unneeded, simply because "ct" pointers will be
different for different conntracks in different netns, just like they
are different in one netns.
Not so sure anymore.
[Patrick: pointers will be different, flushing can only be done while
inactive though and thus it needs to be per netns]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: pass conntrack to nf_conntrack_event_cache() not skb
This is cleaner, we already know conntrack to which event is relevant.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:07 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: cleanup after L3 and L4 proto unregister in every netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: unregister helper in every netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netns: export netns list
Conntrack code will use it for
a) removing expectations and helpers when corresponding module is removed, and
b) removing conntracks when L3 protocol conntrack module is removed.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/ip_conntrack, /proc/net/stat/ip_conntrack, /proc/net/ip_conntrack_expect
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:06 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/nf_conntrack_expect
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:05 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns /proc/net/nf_conntrack, /proc/net/stat/nf_conntrack
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:05 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: pass netns pointer to L4 protocol's ->error hook
Again, it's deducible from skb, but we're going to use it for
nf_conntrack_checksum and statistics, so just pass it from upper layer.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:04 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: pass netns pointer to nf_conntrack_in()
It's deducible from skb->dev or skb->dst->dev, but we know netns at
the moment of call, so pass it down and use for finding and creating
conntracks.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:04 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns unconfirmed list
What is confirmed connection in one netns can very well be unconfirmed
in another one.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns expectations
Make per-netns a) expectation hash and b) expectations count.
Expectations always belongs to netns to which it's master conntrack belong.
This is natural and doesn't bloat expectation.
Proc files and leaf users are stubbed to init_net, this is temporary.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns: fix {ip,6}_route_me_harder() in netns
Take netns from skb->dst->dev. It should be safe because, they are called
from LOCAL_OUT hook where dst is valid (though, I'm not exactly sure about
IPVS and queueing packets to userspace).
[Patrick: its safe everywhere since they already expect skb->dst to be set]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns conntrack hash
* make per-netns conntrack hash
Other solution is to add ->ct_net pointer to tuplehashes and still has one
hash, I tried that it's ugly and requires more code deep down in protocol
modules et al.
* propagate netns pointer to where needed, e. g. to conntrack iterators.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:03 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: per-netns conntrack count
Sysctls and proc files are stubbed to init_net's one. This is temporary.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: add ->ct_net -- pointer from conntrack to netns
Conntrack (struct nf_conn) gets pointer to netns: ->ct_net -- netns in which
it was created. It comes from netdevice.
->ct_net is write-once field.
Every conntrack in system has ->ct_net initialized, no exceptions.
->ct_net doesn't pin netns: conntracks are recycled after timeouts and
pinning background traffic will prevent netns from even starting shutdown
sequence.
Right now every conntrack is created in init_net.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns nf_conntrack: add netns boilerplate
One comment: #ifdefs around #include is necessary to overcome amazing compile
breakages in NOTRACK-in-netns patch (see below).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns: ip6t_REJECT in netns for real
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:02 +0000 (11:35 +0200)]
netfilter: netns: ip6table_mangle in netns for real
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: netns: ip6table_raw in netns for real
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Alexey Dobriyan [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: netns: remove nf_*_net() wrappers
Now that dev_net() exists, the usefullness of them is even less. Also they're
a big problem in resolving circular header dependencies necessary for
NOTRACK-in-netns patch. See below.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: implement NFPROTO_UNSPEC as a wildcard for extensions
When a match or target is looked up using xt_find_{match,target},
Xtables will also search the NFPROTO_UNSPEC module list. This allows
for protocol-independent extensions (like xt_time) to be reused from
other components (e.g. arptables, ebtables).
Extensions that take different codepaths depending on match->family
or target->family of course cannot use NFPROTO_UNSPEC within the
registration structure (e.g. xt_pkttype).
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:01 +0000 (11:35 +0200)]
netfilter: x_tables: use NFPROTO_* in extensions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:00 +0000 (11:35 +0200)]
netfilter: Introduce NFPROTO_* constants
The netfilter subsystem only supports a handful of protocols (much
less than PF_*) and even non-PF protocols like ARP and
pseudo-protocols like PF_BRIDGE. By creating NFPROTO_*, we can earn a
few memory savings on arrays that previously were always PF_MAX-sized
and keep the pseudo-protocols to ourselves.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:00 +0000 (11:35 +0200)]
netfilter: xt_recent: IPv6 support
This updates xt_recent to support the IPv6 address family.
The new /proc/net/xt_recent directory must be used for this.
The old proc interface can also be configured out.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:00 +0000 (11:35 +0200)]
netfilter: rename ipt_recent to xt_recent
Like with other modules (such as ipt_state), ipt_recent.h is changed
to forward definitions to (IOW include) xt_recent.h, and xt_recent.c
is changed to use the new constant names.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Jan Engelhardt [Wed, 8 Oct 2008 09:35:00 +0000 (11:35 +0200)]
netfilter: Use unsigned types for hooknum and pf vars
and (try to) consistently use u_int8_t for the L3 family.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:50:06 +0000 (14:50 -0700)]
netns: make uplitev6 mib per/namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:49:36 +0000 (14:49 -0700)]
netns: make udpv6 mib per/namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:48:53 +0000 (14:48 -0700)]
netns: add stub functions for per/namespace mibs allocation
The content of init_ipv6_mibs/cleanup_ipv6_mibs will be moved to new
calls one by one next.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:47:55 +0000 (14:47 -0700)]
netns: allow per device ipv6 snmp statistics in non-initial namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:47:37 +0000 (14:47 -0700)]
netns: register global ipv6 mibs statistics in each namespace
Unused net variable will become used very soon.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:47:12 +0000 (14:47 -0700)]
ipv6: separate seq_ops for global & per/device ipv6 statistics
idev has been stored on seq->private. NULL has been stored for global
statistics.
The situation is changed with net namespace. We need to store pointer to
struct net and the only place is seq->private. So, we'll have for
/proc/net/dev_snmp6/* and for /proc/net/snmp6 pointers of two different
types stored in the same field.
This effectively requires to separate seq_ops of these files.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:46:47 +0000 (14:46 -0700)]
ipv6: consolidate ipv6 sock_stat code at the beginning of net/ipv6/proc.c
Simple, comsolidate sockstat6 staff in one place, at the beginning of
the file. Right now sockstat6_seq_open/sockstat6_seq_fops looks like an
intrusion in the middle of snmp6 code.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:46:18 +0000 (14:46 -0700)]
netns: register /proc/net/dev_snmp6/* in each ns
Do the same for /proc/net/snmp6.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Tue, 7 Oct 2008 21:45:55 +0000 (14:45 -0700)]
netns: move /proc/net/dev_snmp6 to struct net
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 7 Oct 2008 21:43:31 +0000 (14:43 -0700)]
tcp: cleanup messy initializer
I'm quite sure that if I give this function in its old format
for you to inspect, you start to wonder what is the type of
demanded or if it's a global variable.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Tue, 7 Oct 2008 21:43:06 +0000 (14:43 -0700)]
tcp: kill pointless urg_mode
It all started from me noticing that this urgent check in
tcp_clean_rtx_queue is unnecessarily inside the loop. Then
I took a longer look to it and found out that the users of
urg_mode can trivially do without, well almost, there was
one gotcha.
Bonus: those funny people who use urg with >= 2^31 write_seq -
snd_una could now rejoice too (that's the only purpose for the
between being there, otherwise a simple compare would have done
the thing). Not that I assume that the rest of the tcp code
happily lives with such mind-boggling numbers :-). Alas, it
turned out to be impossible to set wmem to such numbers anyway,
yes I really tried a big sendfile after setting some wmem but
nothing happened :-). ...Tcp_wmem is int and so is sk_sndbuf...
So I hacked a bit variable to long and found out that it seems
to work... :-)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Zijlstra [Tue, 7 Oct 2008 21:22:33 +0000 (14:22 -0700)]
net: packet split receive api
Add some packet-split receive hooks.
For one this allows to do NUMA node affine page allocs. Later on these
hooks will be extended to do emergency reserve allocations for
fragments.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Zijlstra [Tue, 7 Oct 2008 21:18:42 +0000 (14:18 -0700)]
net: wrap sk->sk_backlog_rcv()
Wrap calling sk->sk_backlog_rcv() in a function. This will allow extending the
generic sk_backlog_rcv behaviour.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Zijlstra [Tue, 7 Oct 2008 21:15:00 +0000 (14:15 -0700)]
ipv6: initialize ip6_route sysctl vars in ip6_route_net_init()
This makes that ip6_route_net_init() does all of the route init code.
There used to be a race between ip6_route_net_init() and ip6_net_init()
and someone relying on the combined result was left out cold.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Zijlstra [Tue, 7 Oct 2008 21:12:10 +0000 (14:12 -0700)]
ipv6: clean up ip6_route_net_init() error handling
ip6_route_net_init() error handling looked less than solid, fix 'er up.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
KOVACS Krisztian [Tue, 7 Oct 2008 19:41:01 +0000 (12:41 -0700)]
inet: Don't lookup the socket if there's a socket attached to the skb
Use the socket cached in the skb if it's present.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KOVACS Krisztian [Tue, 7 Oct 2008 19:38:32 +0000 (12:38 -0700)]
inet: Add udplib_lookup_skb() helpers
To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that receive the skb on
their argument list.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnaldo Carvalho de Melo [Tue, 7 Oct 2008 18:41:57 +0000 (11:41 -0700)]
inet_hashtables: Add inet_lookup_skb helpers
To be able to use the cached socket reference in the skb during input
processing we add a new set of lookup functions that receive the skb on
their argument list.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 6 Oct 2008 17:43:54 +0000 (10:43 -0700)]
tcp: Respect SO_RCVLOWAT in tcp_poll().
Based upon a report by Vito Caputo.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarek Poplawski [Mon, 6 Oct 2008 17:41:50 +0000 (10:41 -0700)]
pkt_sched: Simplify dev_requeue_skb and dequeue_skb
qdisc->requeue was planned to universally replace all requeuing code,
but at the top level we never requeue more than one skb, so qdisc->
gso_skb is enough for this. qdisc->requeue would be used on the lower
levels only for one level deep requeuing (like in sch_hfsc) after
finishing all the changes.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarek Poplawski [Mon, 6 Oct 2008 16:54:39 +0000 (09:54 -0700)]
pkt_sched: Fix handling of gso skbs on requeuing
Jay Cliburn noticed and diagnosed a bug triggered in
dev_gso_skb_destructor() after last change from qdisc->gso_skb
to qdisc->requeue list. Since gso_segmented skbs can't be queued
to another list this patch brings back qdisc->gso_skb for them.
Reported-by: Jay Cliburn <jcliburn@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnaud Ebalard [Sun, 5 Oct 2008 20:33:42 +0000 (13:33 -0700)]
xfrm: MIGRATE enhancements (draft-ebalard-mext-pfkey-enhanced-migrate)
Provides implementation of the enhancements of XFRM/PF_KEY MIGRATE mechanism
specified in draft-ebalard-mext-pfkey-enhanced-migrate-00. Defines associated
PF_KEY SADB_X_EXT_KMADDRESS extension and XFRM/netlink XFRMA_KMADDRESS
attribute.
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rémi Denis-Courmont [Sun, 5 Oct 2008 18:16:36 +0000 (11:16 -0700)]
Phonet: pipe end-point protocol documentation
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rémi Denis-Courmont [Sun, 5 Oct 2008 18:16:16 +0000 (11:16 -0700)]
Phonet: implement GPRS virtual interface over PEP socket
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rémi Denis-Courmont [Sun, 5 Oct 2008 18:15:43 +0000 (11:15 -0700)]
Phonet: receive pipe control requests as out-of-band data
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rémi Denis-Courmont [Sun, 5 Oct 2008 18:15:13 +0000 (11:15 -0700)]
Phonet: Pipe End Point for Phonet Pipes protocol
This protocol provides some connection handling and negotiated
congestion control. Nokia cellular modems use it for bulk transfers.
It provides packet boundaries (hence SOCK_SEQPACKET). Congestion
control is per packet rather per byte, so we do not re-use the
generic socket memory accounting.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>