aboutsummaryrefslogtreecommitdiff
path: root/net/ipv4/icmp.c
AgeCommit message (Collapse)Author
2008-04-03[ICMP]: Ensure that ICMP relookup maintains status quoHerbert Xu
The ICMP relookup path is only meant to modify behaviour when appropriate IPsec policies are in place and marked as requiring relookups. It is certainly not meant to modify behaviour when IPsec policies don't exist at all. However, due to an oversight on the error paths existing behaviour may in fact change should one of the relookup steps fail. This patch corrects this by redirecting all errors on relookup failures to the previous code path. That is, if the initial xfrm_lookup let the packet pass, we will stand by that decision should the relookup fail due to an error. This should be safe from a security point-of-view because compliant systems must install a default deny policy so the packet would'nt have passed in that case. Many thanks to Julian Anastasov for pointing out this error. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26[ICMP]: Dst entry leak in icmp_send host re-lookup code (v2).Pavel Emelyanov
Commit 8b7817f3a959ed99d7443afc12f78a7e1fcc2063 ([IPSEC]: Add ICMP host relookup support) introduced some dst leaks on error paths: the rt pointer can be forgotten to be put. Fix it bu going to a proper label. Found after net namespace's lo refused to unregister :) Many thanks to Den for valuable help during debugging. Herbert pointed out, that xfrm_lookup() will put the rtable in case of error itself, so the first goto fix is redundant. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05[ICMP]: Restore pskb_pull calls in receive functionHerbert Xu
Somewhere along the development of my ICMP relookup patch the header length check went AWOL on the non-IPsec path. This patch restores the check. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add namespace for ICMP replying code.Denis V. Lunev
All needed API is done, the namespace is available when required from the device on the DST entry from the incoming packet. So, just replace init_net with proper namespace. Other protocols will follow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Routing cache virtualization.Denis V. Lunev
Basically, this piece looks relatively easy. Namespace is already available on the dst entry via device and the device is safe to dereferrence. Compare it with one of a searcher and skip entry if appropriate. The only exception is ip_rt_frag_needed. So, add namespace parameter to it. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add namespace parameter to ip_route_output_key.Denis V. Lunev
Needed to propagate it down to the ip_route_output_flow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add namespace parameter to __ip_route_output_key.Denis V. Lunev
This is only required to propagate it down to the ip_route_output_slow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[DST]: shrinks sizeof(struct rtable) by 64 bytes on x86_64Eric Dumazet
On x86_64, sizeof(struct rtable) is 0x148, which is rounded up to 0x180 bytes by SLAB allocator. We can reduce this to exactly 0x140 bytes, without alignment overhead, and store 12 struct rtable per PAGE instead of 10. rate_tokens is currently defined as an "unsigned long", while its content should not exceed 6*HZ. It can safely be converted to an unsigned int. Moving tclassid right after rate_tokens to fill the 4 bytes hole permits to save 8 bytes on 'struct dst_entry', which finally permits to save 8 bytes on 'struct rtable' Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Add netns parameter to inet_(dev_)add_type.Eric W. Biederman
The patch extends the inet_addr_type and inet_dev_addr_type with the network namespace pointer. That allows to access the different tables relatively to the network namespace. The modification of the signature function is reported in all the callers of the inet_addr_type using the pointer to the well known init_net. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[ICMP]: Avoid sparse warnings in net/ipv4/icmp.cEric Dumazet
CHECK net/ipv4/icmp.c net/ipv4/icmp.c:249:13: warning: context imbalance in 'icmp_xmit_unlock' - unexpected unlock net/ipv4/icmp.c:376:13: warning: context imbalance in 'icmp_reply' - different lock contexts for basic block net/ipv4/icmp.c:430:6: warning: context imbalance in 'icmp_send' - different lock contexts for basic block Solution is to declare both icmp_xmit_lock() and icmp_xmit_unlock() as inline Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Do not let packets pass when ICMP flag is offHerbert Xu
This fixes a logical error in ICMP policy checks which lets packets through if the state ICMP flag is off. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Fix reversed ICMP6 policy checkHerbert Xu
The policy check I added for ICMP on IPv6 is reversed. This patch fixes that. It also adds an skb->sp check so that unprotected packets that fail the policy check do not crash the machine. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPSEC]: Add ICMP host relookup supportHerbert Xu
RFC 4301 requires us to relookup ICMP traffic that does not match any policies using the reverse of its payload. This patch implements this for ICMP traffic that originates from or terminates on localhost. This is activated on outbound with the new policy flag XFRM_POLICY_ICMP, and on inbound by the new state flag XFRM_STATE_ICMP. On inbound the policy check is now performed by the ICMP protocol so that it can repeat the policy check where necessary. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[IPv4] RAW: Compact the API for the kernelPavel Emelyanov
The raw sockets functions are explicitly used from inside the kernel in two places: 1. in ip_local_deliver_finish to intercept skb-s 2. in icmp_error For this purposes many functions and even data structures, that are naturally internal for raw protocol, are exported. Compact the API to two functions and hide all the other (including hash table and rwlock) inside the net/ipv4/raw.c Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-21[ICMP]: ICMP_MIB_OUTMSGS increment duplicatedWang Chen
Commit "96793b482540f3a26e2188eaf75cb56b7829d3e3" (Add ICMPMsgStats MIB (RFC 4293)) made a mistake. In that patch, David L added a icmp_out_count() in ip_push_pending_frames(), remove icmp_out_count() from icmp_reply(). But he forgot to remove icmp_out_count() from icmp_send() too. Since icmp_send and icmp_reply will call icmp_push_reply, which will call ip_push_pending_frames, a duplicated increment happened in icmp_send. This patch remove the icmp_out_count from icmp_send too. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26[INET]: Unexport icmpmsg_statisticsAdrian Bunk
This patch removes the unused EXPORT_SYMBOL(icmpmsg_statistics). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[IPV4]: Add ICMPMsgStats MIB (RFC 4293)David L Stevens
Background: RFC 4293 deprecates existing individual, named ICMP type counters to be replaced with the ICMPMsgStatsTable. This table includes entries for both IPv4 and IPv6, and requires counting of all ICMP types, whether or not the machine implements the type. These patches "remove" (but not really) the existing counters, and replace them with the ICMPMsgStats tables for v4 and v6. It includes the named counters in the /proc places they were, but gets the values for them from the new tables. It also counts packets generated from raw socket output (e.g., OutEchoes, MLD queries, RA's from radvd, etc). Changes: 1) create icmpmsg_statistics mib 2) create icmpv6msg_statistics mib 3) modify existing counters to use these 4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types listed by number for easy SNMP parsing 5) modify /proc/net/snmp printing for "Icmp" to get the named data from new counters. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10[NET]: Make the device list and device lookups per namespace.Eric W. Biederman
This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-03[ICMP]: Fix icmp_errors_use_inbound_ifaddr sysctlPatrick McHardy
Currently when icmp_errors_use_inbound_ifaddr is set and an ICMP error is sent after the packet passed through ip_output(), an address from the outgoing interface is chosen as ICMP source address since skb->dev doesn't point to the incoming interface anymore. Fix this by doing an interface lookup on rt->dst.iif and using that device. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-19[IPV4]: icmp: fix crash with sysctl_icmp_errors_use_inbound_ifaddrPatrick McHardy
When icmp_send is called on the local output path before the packet hits ip_output, skb->dev is not set, causing a crash when sysctl_icmp_errors_use_inbound_ifaddr is set. This can happen with the netfilter REJECT target or IPsec tunnels. Let routing decide the ICMP source address in that case, since the packet is locally generated there is no inbound interface and the sysctl should not apply. The option actually seems to be unfixable broken, on the path after ip_output() skb->dev points to the outgoing device and we don't know the incoming device anymore, so its going to do the absolute wrong thing and pick the address of the outgoing interface. Add a comment about this. Reported by Curtis Doty <Curtis@GreenKey.net>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Convert skb->tail to sk_buff_data_tArnaldo Carvalho de Melo
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes on 64bit architectures, allowing us to combine the 4 bytes hole left by the layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4 64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN... :-) Many calculations that previously required that skb->{transport,network, mac}_header be first converted to a pointer now can be done directly, being meaningful as offsets or pointers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmphArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iphArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[SK_BUFF]: Introduce skb_network_header()Arnaldo Carvalho de Melo
For the places where we need a pointer to the network header, it is still legal to touch skb->nh.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10[NET] IPV4: Fix whitespace errors.YOSHIFUJI Hideaki
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NET]: Annotate callers of the reset of checksum.h stuff.Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02[NET]: Annotate callers of csum_fold() in net/*Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4] net/ipv4/icmp.c: trivial annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: struct ip_options annotationsAl Viro
->faddr is net-endian; annotated as such, variables inferred to be net-endian annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: icmp_send() annotationAl Viro
The last argument is network-endian (it will go straight into the packet). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: annotate struct in_ifaddrAl Viro
ifa_local, ifa_address, ifa_mask, ifa_broadcast and ifa_anycast are net-endian. Annotated them and variables that are inferred to be net-endian. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-28[IPV4]: inet_select_addr() annotationsAl Viro
argument and return value are net-endian. Annotated function and inferred net-endian variables in callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET/IPV4/IPV6]: Change some sysctl variables to __read_mostlyBrian Haley
Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly. Couldn't actually measure any performance increase while testing (.3% I consider noise), but seems like the right thing to do. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[NET]: Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETEPatrick McHardy
Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose checksum still needs to be completed) and CHECKSUM_COMPLETE (for incoming packets, device supplied full checksum). Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22[MLSXFRM]: Add flow labelingVenkat Yekkirala
This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-17[IPV4] icmp: Kill local 'ip' arg in icmp_redirect().David S. Miller
It is typed wrong, and it's only assigned and used once. So just pass in iph->daddr directly which fixes both problems. Based upon a patch by Alexey Dobriyan. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-11[PATCH] for_each_possible_cpu: network codesKAMEZAWA Hiroyuki
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes in the past where people were using for_each_cpu() where they should have been iterating across only online or present CPUs. This is inefficient and possibly buggy. We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the future. This patch replaces for_each_cpu with for_each_possible_cpu under /net Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25[IPV4]: Aggregate route entries with different TOS valuesIlia Sotnikov
When we get an ICMP need-to-frag message, the original TOS value in the ICMP payload cannot be used as a key to look up the routes to update. This is because the TOS field may have been modified by routers on the way. Similarly, ip_rt_redirect should also ignore the TOS as the router that gave us the message may have modified the TOS value. The patch achieves this objective by aggregating entries with different TOS values (but are otherwise identical) into the same bucket. This makes it easy to update them at the same time when an ICMP message is received. In future we should use a twin-hashing scheme where teh aggregation occurs at the entry level. That is, the TOS goes back into the hash for normal lookups while ICMP lookups will end up with a node that gives us a list that contains all other route entries that differ only by TOS. Signed-off-by: Ilia Sotnikov <hostcc@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13[IPV4] ICMP: Invert default for invalid icmp msgs sysctlDave Jones
isic can trigger these msgs to be spewed at a very high rate. There's already a sysctl to turn them off. Given these messages aren't useful for most people, this patch disables them by default. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04[ICMP]: Fix extra dst release when ip_options_echo failsHerbert Xu
When two ip_route_output_key lookups in icmp_send were combined I forgot to change the error path for ip_options_echo to not drop the dst reference since it now sits before the dst lookup. To fix it we simply jump past the ip_rt_put call. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02[IPV4]: Remove suprious use of goto out: in icmp_replyHorms
This seems to be an artifact of the follwoing commit in February '02. e7e173af42dbf37b1d946f9ee00219cb3b2bea6a In a nutshell, goto out and return actually do the same thing, and both are called in this function. This patch removes out. Signed-Off-By: Horms <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09[NET]: Change some "if (x) BUG();" to "BUG_ON(x);"Kris Katterjohn
This changes some simple "if (x) BUG();" statements to "BUG_ON(x);" Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.hArnaldo Carvalho de Melo
To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-29[NET]: Add const markers to various variables.Arjan van de Ven
the patch below marks various variables const in net/; the goal is to move them to the .rodata section so that they can't false-share cachelines with things that get written to, as well as potentially helping gcc a bit with optimisations. (these were found using a gcc patch to warn about such variables) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10[NET]: Detect hardware rx checksum faults correctlyHerbert Xu
Here is the patch that introduces the generic skb_checksum_complete which also checks for hardware RX checksum faults. If that happens, it'll call netdev_rx_csum_fault which currently prints out a stack trace with the device name. In future it can turn off RX checksum. I've converted every spot under net/ that does RX checksum checks to use skb_checksum_complete or __skb_checksum_complete with the exceptions of: * Those places where checksums are done bit by bit. These will call netdev_rx_csum_fault directly. * The following have not been completely checked/converted: ipmr ip_vs netfilter dccp This patch is based on patches and suggestions from Stephen Hemminger and David S. Miller. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-25[NET]: Wider use of for_each_*cpu()John Hawkes
In 'net' change the explicit use of for-loops and NR_CPUS into the general for_each_cpu() or for_each_online_cpu() constructs, as appropriate. This widens the scope of potential future optimizations of the general constructs, as well as takes advantage of the existing optimizations of first_cpu() and next_cpu(), which is advantageous when the true CPU count is much smaller than NR_CPUS. Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-03[IPV4]: Update icmp sysctl docs and disable broadcast ECHO/TIMESTAMP by defaultDavid S. Miller
It's not a good idea to be smurf'able by default. The few people who need this can turn it on. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointersEric Dumazet
This patch puts mostly read only data in the right section (read_mostly), to help sharing of these data between CPUS without memory ping pongs. On one of my production machine, tcp_statistics was sitting in a heavily modified cache line, so *every* SNMP update had to force a reload. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29[NET]: Make NETDEBUG pure printk wrappersPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>