aboutsummaryrefslogtreecommitdiff
path: root/net/ipv4/raw.c
AgeCommit message (Collapse)Author
2008-01-28[NETNS][RAW]: Create the /proc/net/raw(6) in each namespace.Pavel Emelyanov
To do so, just register the proper subsystem and create files in ->init callbacks. No other special per-namespace handling for raw sockets is required. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS][RAW]: Eliminate explicit init_net references.Pavel Emelyanov
Happily, in all the rest places (->bind callbacks only), that require the struct net, we have a socket, so get the net from it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS][RAW]: Make /proc/net/raw(6) show per-namespace socket list.Pavel Emelyanov
Pull the struct net pointer up to the showing functions to filter the sockets depending on their namespaces. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS][RAW]: Make ipv[46] raw sockets lookup namespaces aware.Pavel Emelyanov
This requires just to pass the appropriate struct net pointer into __raw_v[46]_lookup and skip sockets that do not belong to a needed namespace. The proper net is get from skb->dev in all the cases. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add netns parameter to inet_(dev_)add_type.Eric W. Biederman
The patch extends the inet_addr_type and inet_dev_addr_type with the network namespace pointer. That allows to access the different tables relatively to the network namespace. The modification of the signature function is reported in all the callers of the inet_addr_type using the pointer to the well known init_net. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NET]: prot_inuse cleanups and optimizationsEric Dumazet
1) Cleanups (all functions are prefixed by sock_prot_inuse) sock_prot_inc_use(prot) -> sock_prot_inuse_add(prot,-1) sock_prot_dec_use(prot) -> sock_prot_inuse_add(prot,-1) sock_prot_inuse() -> sock_prot_inuse_get() New functions : sock_prot_inuse_init() and sock_prot_inuse_free() to abstract pcounter use. 2) if CONFIG_PROC_FS=n, we can zap 'inuse' member from "struct proto", since nobody wants to read the inuse value. This saves 1372 bytes on i386/SMP and some cpu cycles. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV4] net/ipv4: Use ipv4_is_<type>Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[RAW]: Consolidate proc interface.Pavel Emelyanov
Both ipv6/raw.c and ipv4/raw.c use the seq files to walk through the raw sockets hash and show them. The "walking" code is rather huge, but is identical in both cases. The difference is the hash table to walk over and the protocol family to check (this was not in the first virsion of the patch, which was noticed by YOSHIFUJI) Make the ->open store the needed hash table and the family on the allocated raw_iter_state and make the start/next/stop callbacks work with it. This removes most of the code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[RAW]: Consolidate proto->unhash callbackPavel Emelyanov
Same as the ->hash one, this is easily consolidated. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[RAW]: Consolidate proto->hash callbackPavel Emelyanov
Having the raw_hashinfo it's easy to consolidate the raw[46]_hash functions. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[RAW]: Introduce raw_hashinfo structurePavel Emelyanov
The ipv4/raw.c and ipv6/raw.c contain many common code (most of which is proc interface) which can be consolidated. Most of the places to consolidate deal with the raw sockets hashtable, so introduce a struct raw_hashinfo which describes the raw sockets hash. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPv4] RAW: Compact the API for the kernelPavel Emelyanov
The raw sockets functions are explicitly used from inside the kernel in two places: 1. in ip_local_deliver_finish to intercept skb-s 2. in icmp_error For this purposes many functions and even data structures, that are naturally internal for raw protocol, are exported. Compact the API to two functions and hide all the other (including hash table and rwlock) inside the net/ipv4/raw.c Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETFILTER]: Introduce NF_INET_ hook valuesPatrick McHardy
The IPv4 and IPv6 hook values are identical, yet some code tries to figure out the "correct" value by looking at the address family. Introduce NF_INET_* values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__ section for userspace compatibility. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPV4]: Add raw drops counter.Wang Chen
Add raw drops counter for IPv4 in /proc/net/raw . Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-08[IPV4] raw: Strengthen check on validity of iph->ihlHerbert Xu
We currently check that iph->ihl is bounded by the real length and that the real length is greater than the minimum IP header length. However, we did not check the caes where iph->ihl is less than the minimum IP header length. This breaks because some ip_fast_csum implementations assume that which is quite reasonable. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPV4]: Use the {DEFINE|REF}_PROTO_INUSE infrastructureEric Dumazet
Trivial patch to make "tcp,udp,udplite,raw" protocols uses the fast "inuse sockets" infrastructure Each protocol use then a static percpu var, instead of a dynamic one. This saves some ram and some cpu cycles Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: Make core networking code use seq_open_privatePavel Emelyanov
This concerns the ipv4 and ipv6 code mostly, but also the netlink and unix sockets. The netlink code is an example of how to use the __seq_open_private() call - it saves the net namespace on this private. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[IPV4]: Add ICMPMsgStats MIB (RFC 4293)David L Stevens
Background: RFC 4293 deprecates existing individual, named ICMP type counters to be replaced with the ICMPMsgStatsTable. This table includes entries for both IPv4 and IPv6, and requires counting of all ICMP types, whether or not the machine implements the type. These patches "remove" (but not really) the existing counters, and replace them with the ICMPMsgStats tables for v4 and v6. It includes the named counters in the /proc places they were, but gets the values for them from the new tables. It also counts packets generated from raw socket output (e.g., OutEchoes, MLD queries, RA's from radvd, etc). Changes: 1) create icmpmsg_statistics mib 2) create icmpv6msg_statistics mib 3) modify existing counters to use these 4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types listed by number for easy SNMP parsing 5) modify /proc/net/snmp printing for "Icmp" to get the named data from new counters. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: Make /proc/net per network namespaceEric W. Biederman
This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-08-02[IPV4] raw.c: kmalloc + memset conversion to kzallocMariusz Kozlowski
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: unions of just one member don't get anything done, kill themArnaldo Carvalho de Melo
Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and skb->mac to skb->mac_header, to match the names of the associated helpers (skb[_[re]set]_{transport,network,mac}_header). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmphArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iphArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[NET]: make seq_operations constStephen Hemminger
The seq_file operations stuff can be marked constant to get it out of dirty cache. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_network_header()Arnaldo Carvalho de Melo
For the places where we need a pointer to the network header, it is still legal to touch skb->nh.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Use skb_reset_network_header where the skb_pull return was being usedArnaldo Carvalho de Melo
But only in the cases where its a newly allocated skb, i.e. one where skb->tail is equal to skb->data, or just after skb_reserve, where this requirement is maintained. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-12[PATCH] mark struct file_operations const 7Arjan van de Ven
Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-10[NET] IPV4: Fix whitespace errors.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08[IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts.David S. Miller
Do this even for non-blocking sockets. This avoids the silly -EAGAIN that applications can see now, even for non-blocking sockets in some cases (f.e. connect()). With help from Venkat Tekkirala. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[IPV6]: Assorted trivial endianness annotations.Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-30[NET]: fix uaccess handlingHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: struct ipcm_cookie annotationAl Viro
->addr is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: struct ip_options annotationsAl Viro
->faddr is net-endian; annotated as such, variables inferred to be net-endian annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET]: Remove unnecessary config.h includes from net/Dave Jones
config.h is automatically included by kbuild these days. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[MLSXFRM]: Add flow labelingVenkat Yekkirala
This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-25[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg().Tetsuo Handa
From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp The recvmsg() for raw socket seems to return random u16 value from the kernel stack memory since port field is not initialized. But I'm not sure this patch is correct. Does raw socket return any information stored in port field? [ BSD defines RAW IP recvmsg to return a sin_port value of zero. This is described in Steven's TCP/IP Illustrated Volume 2 on page 1055, which is discussing the BSD rip_input() implementation. ] Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17[IPV4]: Right prototype of __raw_v4_lookup()Alexey Dobriyan
All users pass 32-bit values as addresses and internally they're compared with 32-bit entities. So, change "laddr" and "raddr" types to __be32. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20[NET]: Identation & other cleanups related to compat_[gs]etsockopt csetArnaldo Carvalho de Melo
No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs to just after the function exported, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20[NET]: {get|set}sockopt compatibility layerDmitry Mishin
This patch extends {get|set}sockopt compatibility layer in order to move protocol specific parts to their place and avoid huge universal net/compat.c file in the future. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-18[PATCH] EDAC: atomic scrub operationsAlan Cox
EDAC requires a way to scrub memory if an ECC error is found and the chipset does not do the work automatically. That means rewriting memory locations atomically with respect to all CPUs _and_ bus masters. That means we can't use atomic_add(foo, 0) as it gets optimised for non-SMP This adds a function to include/asm-foo/atomic.h for the platforms currently supported which implements a scrub of a mapped block. It also adjusts a few other files include order where atomic.h is included before types.h as this now causes an error as atomic_scrub uses u32. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-07[NETFILTER]: Keep conntrack reference until IPsec policy checks are donePatrick McHardy
Keep the conntrack reference until policy checks have been performed for IPsec NAT support. The reference needs to be dropped before a packet is queued to avoid having the conntrack module unloadable. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19[PATCH] raw_sendmsg DoS on 2.6Mark J Cox
Fix unchecked __get_user that could be tricked into generating a memory read on an arbitrary address. The result of the read is not returned directly but you may be able to divine some information about it, or use the read to cause a crash on some architectures by reading hardware state. CAN-2004-2492. Fix from Al Viro, ack from Dave Miller. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29[TCP]: Move the tcp sock states to net/tcp_states.hArnaldo Carvalho de Melo
Lots of places just needs the states, not even linux/tcp.h, where this enum was, needs it. This speeds up development of the refactorings as less sources are rebuilt when things get moved from net/tcp.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[IPV4/6]: Check if packet was actually delivered to a raw socket to decide ↵Patrick McHardy
whether to send an ICMP unreachable Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPV4]: [4/4] signed vs unsigned cleanup in net/ipv4/raw.cJesper Juhl
This patch changes the type of the third parameter 'length' of the raw_send_hdrinc() function from 'int' to 'size_t'. This makes sense since this function is only ever called from one location, and the value passed as the third parameter in that location is itself of type size_t, so this makes the recieving functions parameter type match. Also, inside raw_send_hdrinc() the 'length' variable is used in comparisons with unsigned values and passed as parameter to functions expecting unsigned values (it's used in a single comparison with a signed value, but that one can never actually be negative so the patch also casts that one to size_t to stop gcc worrying, and it is passed in a single instance to memcpy_fromiovecend() which expects a signed int, but as far as I can see that's not a problem since the value of 'length' shouldn't ever exceed the value of a signed int). Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPV4]: [3/4] signed vs unsigned cleanup in net/ipv4/raw.cJesper Juhl
This patch changes the type of the local variable 'i' in raw_probe_proto_opt() from 'int' to 'unsigned int'. The only use of 'i' in this function is as a counter in a for() loop and subsequent index into the msg->msg_iov[] array. Since 'i' is compared in a loop to the unsigned variable msg->msg_iovlen gcc -W generates this warning : net/ipv4/raw.c:340: warning: comparison between signed and unsigned Changing 'i' to unsigned silences this warning and is safe since the array index can never be negative anyway, so unsigned int is the logical type to use for 'i' and also enables a larger msg_iov[] array (but I don't know if that will ever matter). Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPV4]: [2/4] signed vs unsigned cleanup in net/ipv4/raw.cJesper Juhl
This patch gets rid of the following gcc -W warning in net/ipv4/raw.c : net/ipv4/raw.c:387: warning: comparison of unsigned expression < 0 is always false Since 'len' is of type size_t it is unsigned and can thus never be <0, and since this is obvious from the function declaration just a few lines above I think it's ok to remove the pointless check for len<0. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPV4]: [1/4] signed vs unsigned cleanup in net/ipv4/raw.cJesper Juhl
This patch silences these two gcc -W warnings in net/ipv4/raw.c : net/ipv4/raw.c:517: warning: signed and unsigned type in conditional expression net/ipv4/raw.c:613: warning: signed and unsigned type in conditional expression It doesn't change the behaviour of the code, simply writes the conditional expression with plain 'if()' syntax instead of '? :' , but since this breaks it into sepperate statements gcc no longer complains about having both a signed and unsigned value in the same conditional expression. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18[IPV4/IPV6]: Replace spin_lock_irq with spin_lock_bhHerbert Xu
In light of my recent patch to net/ipv4/udp.c that replaced the spin_lock_irq calls on the receive queue lock with spin_lock_bh, here is a similar patch for all other occurences of spin_lock_irq on receive/error queue locks in IPv4 and IPv6. In these stacks, we know that they can only be entered from user or softirq context. Therefore it's safe to disable BH only. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-05[PATCH] update Ross Biro bouncing email addressJesper Juhl
Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>