aboutsummaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)Author
2008-01-31[NETNS]: Lookup in FIB semantic hashes taking into account the namespace.Denis V. Lunev
The namespace is not available in the fib_sync_down_addr, add it as a parameter. Looking up a device by the pointer to it is OK. Looking up using a result from fib_trie/fib_hash table lookup is also safe. No need to fix that at all. So, just fix lookup by address and insertion to the hash table path. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETNS]: Add a namespace mark to fib_info.Denis V. Lunev
This is required to make fib_info lookups namespace aware. In the other case initial namespace devices are marked as dead in the local routing table during other namespace stop. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[IPV4]: fib_sync_down rework.Denis V. Lunev
fib_sync_down can be called with an address and with a device. In reality it is called either with address OR with a device. The codepath inside is completely different, so lets separate it into two calls for these two cases. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NET_SCHED]: Constify struct tcf_ext_mapPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[IPV4] route cache: Introduce rt_genid for smooth cache invalidationEric Dumazet
Current ip route cache implementation is not suited to large caches. We can consume a lot of CPU when cache must be invalidated, since we currently need to evict all cache entries, and this eviction is sometimes asynchronous. min_delay & max_delay can somewhat control this asynchronism behavior, but whole thing is a kludge, regularly triggering infamous soft lockup messages. When entries are still in use, this also consumes a lot of ram, filling dst_garbage.list. A better scheme is to use a generation identifier on each entry, so that cache invalidation can be performed by changing the table identifier, without having to scan all entries. No more delayed flushing, no more stalling when secret_interval expires. Invalidated entries will then be freed at GC time (controled by ip_rt_gc_timeout or stress), or when an invalidated entry is found in a chain when an insert is done. Thus we keep a normal equilibrium. This patch : - renames rt_hash_rnd to rt_genid (and makes it an atomic_t) - Adds a new rt_genid field to 'struct rtable' (filling a hole on 64bit) - Checks entry->rt_genid at appropriate places :
2008-01-31[NETNS]: Tcp-v6 sockets per-net lookup.Pavel Emelyanov
Add a net argument to inet6_lookup and propagate it further. Actually, this is tcp-v6 implementation of what was done for tcp-v4 sockets in a previous patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETNS]: Tcp-v4 sockets per-net lookup.Pavel Emelyanov
Add a net argument to inet_lookup and propagate it further into lookup calls. Plus tune the __inet_check_established. The dccp and inet_diag, which use that lookup functions pass the init_net into them. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETNS]: Make bind buckets live in net namespaces.Pavel Emelyanov
This tags the inet_bind_bucket struct with net pointer, initializes it during creation and makes a filtering during lookup. A better hashfn, that takes the net into account is to be done in the future, but currently all bind buckets with similar port will be in one hash chain. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[INET]: Consolidate inet(6)_hash_connect.Pavel Emelyanov
These two functions are the same except for what they call to "check_established" and "hash" for a socket. This saves half-a-kilo for ipv4 and ipv6. add/remove: 1/0 grow/shrink: 1/4 up/down: 582/-1128 (-546) function old new delta __inet_hash_connect - 577 +577 arp_ignore 108 113 +5 static.hint 8 4 -4 rt_worker_func 376 372 -4 inet6_hash_connect 584 25 -559 inet_hash_connect 586 25 -561 Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: annotate l3protos with constJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_{conntrack,nat}_proto_tcp: constify and annotate TCP modulesJan Engelhardt
Constify a few data tables use const qualifiers on variables where possible in the nf_*_proto_tcp sources. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: naming unificationPatrick McHardy
Rename all "conntrack" variables to "ct" for more consistency and avoiding some overly long lines. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: reorder struct nf_conntrack_l4protoPatrick McHardy
Reorder struct nf_conntrack_l4proto so all members used during packet processing are in the same cacheline. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: avoid duplicate protocol comparison in ↵Patrick McHardy
nf_ct_tuple_equal() nf_ct_tuple_src_equal() and nf_ct_tuple_dst_equal() both compare the protocol numbers. Unfortunately gcc doesn't optimize out the second comparison, so remove it and prefix both functions with __ to indicate that they should not be used directly. Saves another 16 byte of text in __nf_conntrack_find() on x86_64: nf_conntrack_tuple_taken | -20 # 320 -> 300, size inlines: 181 -> 161 __nf_conntrack_find | -16 # 267 -> 251, size inlines: 127 -> 115 __nf_conntrack_confirm | -40 # 875 -> 835, size inlines: 570 -> 537 3 functions changed, 76 bytes removed Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: optimize __nf_conntrack_find()Patrick McHardy
Ignoring specific entries in __nf_conntrack_find() is only needed by NAT for nf_conntrack_tuple_taken(). Remove it from __nf_conntrack_find() and make nf_conntrack_tuple_taken() search the hash itself. Saves 54 bytes of text in the hotpath on x86_64: __nf_conntrack_find | -54 # 321 -> 267, # inlines: 3 -> 2, size inlines: 181 -> 127 nf_conntrack_tuple_taken | +305 # 15 -> 320, lexblocks: 0 -> 3, # inlines: 0 -> 3, size inlines: 0 -> 181 nf_conntrack_find_get | -2 # 90 -> 88 3 functions changed, 305 bytes added, 56 bytes removed, diff: +249 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: switch rwlock to spinlockPatrick McHardy
With the RCU conversion only write_lock usages of nf_conntrack_lock are left (except one read_lock that should actually use write_lock in the H.323 helper). Switch to a spinlock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: use RCU for conntrack hashPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack_expect: use RCU for expectation hashPatrick McHardy
Use RCU for expectation hash. This doesn't buy much for conntrack runtime performance, but allows to reduce the use of nf_conntrack_lock for /proc and nf_netlink_conntrack. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: use RCU for conntrack helpersPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_conntrack: sparse warningsStephen Hemminger
The hashtable size is really unsigned so sparse complains when you pass a signed integer. Change all uses to make it consistent. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: arp_tables: per-netns arp_tables FILTERAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: ip6_tables: per-netns IPv6 FILTER, MANGLE, RAWAlexey Dobriyan
Now it's possible to list and manipulate per-netns ip6tables rules. Filtering decisions are based on init_net's table so far. P.S.: remove init_net check in inet6_create() to see the effect Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAWAlexey Dobriyan
Now, iptables show and configure different set of rules in different netnss'. Filtering decisions are still made by consulting only init_net's set. Changes are identical except naming so no splitting. P.S.: one need to remove init_net checks in nf_sockopt.c and inet_create() to see the effect. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: x_tables: per-netns xt_tablesAlexey Dobriyan
In fact all we want is per-netns set of rules, however doing that will unnecessary complicate routines such as ipt_hook()/ipt_do_table, so make full xt_table array per-netns. Every user stubbed with init_net for a while. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: ebtables: remove casts, use constsJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NETFILTER]: nf_log: add netfilter gcc printf format checkingHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[RAW]: Wrong content of the /proc/net/raw6.Denis V. Lunev
The address of IPv6 raw sockets was shown in the wrong format, from IPv4 ones. The problem has been introduced by the commit 42a73808ed4f30b739eb52bcbb33a02fe62ceef5 ("[RAW]: Consolidate proc interface.") Thanks to Adrian Bunk who originally noticed the problem. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[RAW]: Family check in the /proc/net/raw[6] is extra.Denis V. Lunev
Different hashtables are used for IPv6 and IPv4 raw sockets, so no need to check the socket family in the iterator over hashtables. Clean this out. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[XFRM]: constify 'struct xfrm_type'Eric Dumazet
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[NET]: Introducing socket mark socket option.Laszlo Attila Toth
A userspace program may wish to set the mark for each packets its send without using the netfilter MARK target. Changing the mark can be used for mark based routing without netfilter or for packet filtering. It requires CAP_NET_ADMIN capability. Signed-off-by: Laszlo Attila Toth <panther@balabit.hu> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[IPSEC]: Add support for combined mode algorithmsHerbert Xu
This patch adds support for combined mode algorithms with GCM being the first algorithm supported. Combined mode algorithms can be added through the xfrm_user interface using the new algorithm payload type XFRMA_ALG_AEAD. Each algorithms is identified by its name and the ICV length. For the purposes of matching algorithms in xfrm_tmpl structures, combined mode algorithms occupy the same name space as encryption algorithms. This is in line with how they are negotiated using IKE. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31[IPSEC]: Use crypto_aead and authenc in ESPHerbert Xu
This patch converts ESP to use the crypto_aead interface and in particular the authenc algorithm. This lays the foundations for future support of combined mode algorithms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-30NetLabel: Introduce static network labels for unlabeled connectionsPaul Moore
Most trusted OSs, with the exception of Linux, have the ability to specify static security labels for unlabeled networks. This patch adds this ability to the NetLabel packet labeling framework. If the NetLabel subsystem is called to determine the security attributes of an incoming packet it first checks to see if any recognized NetLabel packet labeling protocols are in-use on the packet. If none can be found then the unlabled connection table is queried and based on the packets incoming interface and address it is matched with a security label as configured by the administrator using the netlabel_tools package. The matching security label is returned to the caller just as if the packet was explicitly labeled using a labeling protocol. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30NetLabel: Add IP address family information to the netlbl_skbuff_getattr() ↵Paul Moore
function In order to do any sort of IP header inspection of incoming packets we need to know which address family, AF_INET/AF_INET6/etc., it belongs to and since the sk_buff structure does not store this information we need to pass along the address family separate from the packet itself. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30NetLabel: Add secid token support to the NetLabel secattr structPaul Moore
This patch adds support to the NetLabel LSM secattr struct for a secid token and a type field, paving the way for full LSM/SELinux context support and "static" or "fallback" labels. In addition, this patch adds a fair amount of documentation to the core NetLabel structures used as part of the NetLabel kernel API. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2008-01-28[NET_SCHED]: act_api: use PTR_ERR in tcf_action_init/tcf_action_getPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Routing cache virtualization.Denis V. Lunev
Basically, this piece looks relatively easy. Namespace is already available on the dst entry via device and the device is safe to dereferrence. Compare it with one of a searcher and skip entry if appropriate. The only exception is ip_rt_frag_needed. So, add namespace parameter to it. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Correct namespace for connect-time routing.Denis V. Lunev
ip_route_connect and ip_route_newports are a part of routing API presented to the socket layer. The namespace is available inside them through a socket. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NET_SCHED]: Convert actions from rtnetlink to new netlink APIPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NET_SCHED]: Convert classifiers from rtnetlink to new netlink APIPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NET_SCHED]: Convert packet schedulers from rtnetlink to new netlink APIPatrick McHardy
Convert packet schedulers to use the netlink API. Unfortunately a gradual conversion is not possible without breaking compilation in the middle or adding lots of casts, so this patch converts them all in one step. The patch has been mostly generated automatically with some minor edits to at least allow seperate conversion of classifiers and actions. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETLINK]: Add nla_append()Patrick McHardy
Used to append data to a message without a header or padding. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add namespace parameter to ip_route_output_key.Denis V. Lunev
Needed to propagate it down to the ip_route_output_flow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add namespace parameter to ip_route_output_flow.Denis V. Lunev
Needed to propagate it down to the __ip_route_output_key. Signed_off_by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add namespace parameter to __ip_route_output_key.Denis V. Lunev
This is only required to propagate it down to the ip_route_output_slow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add netns parameter to fib_select_default.Denis V. Lunev
Currently fib_select_default calls fib_get_table() with the init_net. Prepare it to provide a correct namespace to lookup default route. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV4]: Consolidate fib_select_default.Denis V. Lunev
The difference in the implementation of the fib_select_default when CONFIG_IP_MULTIPLE_TABLES is (not) defined looks negligible. Consolidate it and place into fib_frontend.c. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV4]: Declarations cleanup in ip_fib.h.Denis V. Lunev
Two small issues fixed: - fib_select_multipath is exported from fib_semantics.c rather than from fib_frontend.c. So, move the declaration below appropriate comment. - struct rt_entry declaration is not used. Drop it. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DST]: shrinks sizeof(struct rtable) by 64 bytes on x86_64Eric Dumazet
On x86_64, sizeof(struct rtable) is 0x148, which is rounded up to 0x180 bytes by SLAB allocator. We can reduce this to exactly 0x140 bytes, without alignment overhead, and store 12 struct rtable per PAGE instead of 10. rate_tokens is currently defined as an "unsigned long", while its content should not exceed 6*HZ. It can safely be converted to an unsigned int. Moving tclassid right after rate_tokens to fill the 4 bytes hole permits to save 8 bytes on 'struct dst_entry', which finally permits to save 8 bytes on 'struct rtable' Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS][FRAGS]: Make the pernet subsystem for fragments.Pavel Emelyanov
On namespace start we mainly prepare the ctl variables. When the namespace is stopped we have to kill all the fragments that point to this namespace. The inet_frags_exit_net() handles it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>