aboutsummaryrefslogtreecommitdiff
path: root/net/ipv6
AgeCommit message (Collapse)Author
2010-05-05IPv6: fix IPV6_RECVERR handling of locally-generated errorsBrian Haley
I noticed when I added support for IPV6_DONTFRAG that if you set IPV6_RECVERR and tried to send a UDP packet larger than 64K to an IPv6 destination, you'd correctly get an EMSGSIZE, but reading from MSG_ERRQUEUE returned the incorrect address in the cmsg: struct msghdr: msg_name 0x7fff8f3c96d0 msg_namelen 28 struct sockaddr_in6: sin6_family 10 sin6_port 7639 sin6_flowinfo 0 sin6_addr ::ffff:38.32.0.0 sin6_scope_id 0 ((null)) It should have returned this in my case: struct msghdr: msg_name 0x7fffd866b510 msg_namelen 28 struct sockaddr_in6: sin6_family 10 sin6_port 7639 sin6_flowinfo 0 sin6_addr 2620:0:a09:e000:21f:29ff:fe57:f88b sin6_scope_id 0 ((null)) The problem is that ipv6_recv_error() assumes that if the error wasn't generated by ICMPv6, it's an IPv4 address sitting there, and proceeds to create a v4-mapped address from it. Change ipv6_icmp_error() and ipv6_local_error() to set skb->protocol to htons(ETH_P_IPV6) so that ipv6_recv_error() knows the address sitting right after the extended error is IPv6, else it will incorrectly map the first octet into an IPv4-mapped IPv6 address in the cmsg structure returned in a recvmsg() call to obtain the error. Signed-off-by: Brian Haley <brian.haley@hp.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-03ipv6: Fix default multicast hops setting.David S. Miller
As per RFC 3493 the default multicast hops setting for a socket should be "1" just like ipv4. Ironically we have a IPV6_DEFAULT_MCASTHOPS macro it just wasn't being used. Reported-by: Elliot Hughes <enh@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28Revert "tcp: bind() fix when many ports are bound"David S. Miller
This reverts two commits: fda48a0d7a8412cedacda46a9c0bf8ef9cd13559 tcp: bind() fix when many ports are bound and a follow-on fix for it: 6443bb1fc2050ca2b6585a3fa77f7833b55329ed ipv6: Fix inet6_csk_bind_conflict() It causes problems with binding listening sockets when time-wait sockets from a previous instance still are alive. It's too late to keep fiddling with this so late in the -rc series, and we'll deal with it in net-next-2.6 instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-25ipv6: Fix inet6_csk_bind_conflict()Eric Dumazet
Commit fda48a0d7a84 (tcp: bind() fix when many ports are bound) introduced a bug on IPV6 part. We should not call ipv6_addr_any(inet6_rcv_saddr(sk2)) but ipv6_addr_any(inet6_rcv_saddr(sk)) because sk2 can be IPV4, while sk is IPV6. Reported-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-22tcp: bind() fix when many ports are boundEric Dumazet
Port autoselection done by kernel only works when number of bound sockets is under a threshold (typically 30000). When this threshold is over, we must check if there is a conflict before exiting first loop in inet_csk_get_port() Change inet_csk_bind_conflict() to forbid two reuse-enabled sockets to bind on same (address,port) tuple (with a non ANY address) Same change for inet6_csk_bind_conflict() Reported-by: Gaspar Chilingarov <gasparch@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-21net: ipv6 bind to device issueJiri Olsa
The issue raises when having 2 NICs both assigned the same IPv6 global address. If a sender binds to a particular NIC (SO_BINDTODEVICE), the outgoing traffic is being sent via the first found. The bonded device is thus not taken into an account during the routing. From the ip6_route_output function: If the binding address is multicast, linklocal or loopback, the RT6_LOOKUP_F_IFACE bit is set, but not for global address. So binding global address will neglect SO_BINDTODEVICE-binded device, because the fib6_rule_lookup function path won't check for the flowi::oif field and take first route that fits. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Scott Otto <scott.otto@alcatel-lucent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-21ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU ↵Shan Wei
field less than IPV6_MIN_MTU According to RFC2460, PMTU is set to the IPv6 Minimum Link MTU (1280) and a fragment header should always be included after a node receiving Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU. After receiving a ICMPv6 Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU, sctp *can't* send any data/control chunk that total length including IPv6 head and IPv6 extend head is less than IPV6_MIN_MTU(1280 bytes). The failure occured in p6_fragment(), about reason see following(take SHUTDOWN chunk for example): sctp_packet_transmit (SHUTDOWN chunk, len=16 byte) |------sctp_v6_xmit (local_df=0) |------ip6_xmit |------ip6_output (dst_allfrag is ture) |------ip6_fragment In ip6_fragment(), for local_df=0, drops the the packet and returns EMSGSIZE. The patch fixes it with adding check length of skb->len. In this case, Ipv6 not to fragment upper protocol data, just only add a fragment header before it. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-21xfrm6: ensure to use the same dev when building a bundleNicolas Dichtel
When building a bundle, we set dst.dev and rt6.rt6i_idev. We must ensure to set the same device for both fields. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-21ipv6: Fix tcp_v6_send_response transport header setting.Herbert Xu
My recent patch to remove the open-coded checksum sequence in tcp_v6_send_response broke it as we did not set the transport header pointer on the new packet. Actually, there is code there trying to set the transport header properly, but it sets it for the wrong skb ('skb' instead of 'buff'). This bug was introduced by commit a8fdf2b331b38d61fb5f11f3aec4a4f9fb2dedcb ("ipv6: Fix tcp_v6_send_response(): it didn't set skb transport header") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-15ip: Fix ip_dev_loopback_xmit()Eric Dumazet
Eric Paris got following trace with a linux-next kernel [ 14.203970] BUG: using smp_processor_id() in preemptible [00000000] code: avahi-daemon/2093 [ 14.204025] caller is netif_rx+0xfa/0x110 [ 14.204035] Call Trace: [ 14.204064] [<ffffffff81278fe5>] debug_smp_processor_id+0x105/0x110 [ 14.204070] [<ffffffff8142163a>] netif_rx+0xfa/0x110 [ 14.204090] [<ffffffff8145b631>] ip_dev_loopback_xmit+0x71/0xa0 [ 14.204095] [<ffffffff8145b892>] ip_mc_output+0x192/0x2c0 [ 14.204099] [<ffffffff8145d610>] ip_local_out+0x20/0x30 [ 14.204105] [<ffffffff8145d8ad>] ip_push_pending_frames+0x28d/0x3d0 [ 14.204119] [<ffffffff8147f1cc>] udp_push_pending_frames+0x14c/0x400 [ 14.204125] [<ffffffff814803fc>] udp_sendmsg+0x39c/0x790 [ 14.204137] [<ffffffff814891d5>] inet_sendmsg+0x45/0x80 [ 14.204149] [<ffffffff8140af91>] sock_sendmsg+0xf1/0x110 [ 14.204189] [<ffffffff8140dc6c>] sys_sendmsg+0x20c/0x380 [ 14.204233] [<ffffffff8100ad82>] system_call_fastpath+0x16/0x1b While current linux-2.6 kernel doesnt emit this warning, bug is latent and might cause unexpected failures. ip_dev_loopback_xmit() runs in process context, preemption enabled, so must call netif_rx_ni() instead of netif_rx(), to make sure that we process pending software interrupt. Same change for ip6_dev_loopback_xmit() Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-11Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2010-04-08udp: fix for unicast RX path optimizationJorge Boncompte [DTI2]
Commits 5051ebd275de672b807c28d93002c2fb0514a3c9 and 5051ebd275de672b807c28d93002c2fb0514a3c9 ("ipv[46]: udp: optimize unicast RX path") broke some programs. After upgrading a L2TP server to 2.6.33 it started to fail, tunnels going up an down, after the 10th tunnel came up. My modified rp-l2tp uses a global unconnected socket bound to (INADDR_ANY, 1701) and one connected socket per tunnel after parameter negotiation. After ten sockets were open and due to mixed parameters to udp[46]_lib_lookup2() kernel started to drop packets. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-28ipv6: Don't drop cache route entry unless timer actually expired.YOSHIFUJI Hideaki / 吉藤英明
This is ipv6 variant of the commit 5e016cbf6.. ("ipv4: Don't drop redirected route cache entry unless PTMU actually expired") by Guenter Roeck <guenter.roeck@ericsson.com>. Remove cache route entry in ipv6_negative_advice() only if the timer is expired. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-27net: ipmr/ip6mr: prevent out-of-bounds vif_table accessNicolas Dichtel
When cache is unresolved, c->mf[6]c_parent is set to 65535 and minvif, maxvif are not initialized, hence we must avoid to parse IIF and OIF. A second problem can happen when the user dumps a cache entry where a VIF, that was referenced at creation time, has been removed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26net: fix netlink address dumping in IPv4/IPv6Patrick McHardy
When a dump is interrupted at the last device in a hash chain and then continued, "idx" won't get incremented past s_idx, so s_ip_idx is not reset when moving on to the next device. This means of all following devices only the last n - s_ip_idx addresses are dumped. Tested-by: Pawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-25Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2010-03-25netfilter: ip6table_raw: fix table priorityJozsef Kadlecsik
The order of the IPv6 raw table is currently reversed, that makes impossible to use the NOTRACK target in IPv6: for example if someone enters ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK and if we receive fragmented packets then the first fragment will be untracked and thus skip nf_ct_frag6_gather (and conntrack), while all subsequent fragments enter nf_ct_frag6_gather and reassembly will never successfully be finished. Singed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-19net: ipmr/ip6mr: fix potential out-of-bounds vif_table accessPatrick McHardy
mfc_parent of cache entries is used to index into the vif_table and is initialised from mfcctl->mfcc_parent. This can take values of to 2^16-1, while the vif_table has only MAXVIFS (32) entries. The same problem affects ip6mr. Refuse invalid values to fix a potential out-of-bounds access. Unlike the other validity checks, this is checked in ipmr_mfc_add() instead of the setsockopt handler since its unused in the delete path and might be uninitialized. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19ipv6: Remove redundant dst NULL check in ip6_dst_checkHerbert Xu
As the only path leading to ip6_dst_check makes an indirect call through dst->ops, dst cannot be NULL in ip6_dst_check. This patch removes this check in case it misleads people who come across this code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-13ipv6: Send netlink notification when DAD failsHerbert Xu
If we are managing IPv6 addresses using DHCP, it would be nice for user-space to be notified if an address configured through DHCP fails DAD. Otherwise user-space would have to poll to see whether DAD succeeds. This patch uses the existing notification mechanism and simply hooks it into the DAD failure code path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-08tcp: Add SNMP counters for backlog and min_ttl dropsEric Dumazet
Commit 6b03a53a (tcp: use limited socket backlog) added the possibility of dropping frames when backlog queue is full. Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the possibility of dropping frames when TTL is under a given limit. This patch adds new SNMP MIB entries, named TCPBacklogDrop and TCPMinTTLDrop, published in /proc/net/netstat in TcpExt: line netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop" TCPBacklogDrop: 0 TCPMinTTLDrop: 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-07ipv6: Optmize translation between IPV6_PREFER_SRC_xxx and RT6_LOOKUP_F_xxx.YOSHIFUJI Hideaki / 吉藤英明
IPV6_PREFER_SRC_xxx definitions: | #define IPV6_PREFER_SRC_TMP 0x0001 | #define IPV6_PREFER_SRC_PUBLIC 0x0002 | #define IPV6_PREFER_SRC_COA 0x0004 RT6_LOOKUP_F_xxx definitions: | #define RT6_LOOKUP_F_SRCPREF_TMP 0x00000008 | #define RT6_LOOKUP_F_SRCPREF_PUBLIC 0x00000010 | #define RT6_LOOKUP_F_SRCPREF_COA 0x00000020 So, we can translate between these two groups by shift operation instead of multiple 'if's. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05net: backlog functions renameZhu Yi
sk_add_backlog -> __sk_add_backlog sk_add_backlog_limited -> sk_add_backlog Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05udp: use limited socket backlogZhu Yi
Make udp adapt to the limited socket backlog change. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05tcp: use limited socket backlogZhu Yi
Make tcp adapt to the limited socket backlog change. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-04IPv6: fix race between cleanup and add/delete addressstephen hemminger
This solves a potential race problem during the cleanup process. The issue is that addrconf_ifdown() needs to traverse address list, but then drop lock to call the notifier. The version in -next could get confused if add/delete happened during this window. Original code (2.6.32 and earlier) was okay because all addresses were always deleted. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-04IPv6: addrconf notify when address is unavailablestephen hemminger
My recent change in net-next to retain permanent addresses caused regression. Device refcount would not go to zero when device was unregistered because left over anycast reference would hold ipv6 dev reference which would hold device references... The correct procedure is to call notify chain when address is no longer available for use. When interface comes back DAD timer will notify back that address is available. Also, link local addresses should be purged when interface is brought down. The address might be changed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-04IPv6: addrconf timer racestephen hemminger
The Router Solicitation timer races with device state changes because it doesn't lock the device. Use local variable to avoid one repeated dereference. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-04IPv6: addrconf dad timer unnecessary bh_disablestephen hemminger
Timer code runs in bottom half, so there is no need for using _bh form of locking. Also check if device is not ready to avoid race with address that is no longer active. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-03ipsec: Fix bogus bundle flowiHerbert Xu
When I merged the bundle creation code, I introduced a bogus flowi value in the bundle. Instead of getting from the caller, it was instead set to the flow in the route object, which is totally different. The end result is that the bundles we created never match, and we instead end up with an ever growing bundle list. Thanks to Jamal for find this problem. Reported-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-26Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2010-02-26netfilter: xtables: restore indentationJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-26ipv6: Use 1280 as min MTU for ipv6 forwardingUlrich Weber
Clients will set their MTU to 1280 if they receive a ICMPV6_PKT_TOOBIG message with an MTU less than 1280. To allow encapsulating of packets over a 1280 link we should always accept packets with a size of 1280 for forwarding even if the path has a lower MTU and fragment the encapsulated packets afterwards. In case a forwarded packet is not going to be encapsulated a ICMPV6_PKT_TOOBIG msg will still be send by ip6_fragment() with the correct MTU. Signed-off-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-26ipv6: Remove IPV6_ADDR_RESERVEDUlrich Weber
RFC 4291 section 2.4 states that all uncategorized addresses should be considered as Global Unicast. This will remove IPV6_ADDR_RESERVED completely and return IPV6_ADDR_UNICAST in ipv6_addr_type() instead. Signed-off-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-25Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-02-24Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2010-02-24netfilter: xtables: reduce arguments to translate_tableJan Engelhardt
Just pass in the entire repl struct. In case of a new table (e.g. ip6t_register_table), the repldata has been previously filled with table->name and table->size already (in ip6t_alloc_initial_table). Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24netfilter: xtables: optimize call flow around xt_ematch_foreachJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24netfilter: xtables: replace XT_MATCH_ITERATE macroJan Engelhardt
The macro is replaced by a list.h-like foreach loop. This makes the code more inspectable. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24netfilter: xtables: optimize call flow around xt_entry_foreachJan Engelhardt
Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24netfilter: xtables: replace XT_ENTRY_ITERATE macroJan Engelhardt
The macro is replaced by a list.h-like foreach loop. This makes the code much more inspectable. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-22xfrm: SA lookups signature with markJamal Hadi Salim
pass mark to all SA lookups to prepare them for when we add code to have them search. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-19net: Fix sysctl restarts...Eric W. Biederman
Yuck. It turns out that when we restart sysctls we were restarting with the values already changed. Which unfortunately meant that the second time through we thought there was no change and skipped all kinds of work, despite the fact that there was indeed a change. I have fixed this the simplest way possible by restoring the changed values when we restart the sysctl write. One of my coworkers spotted this bug when after disabling forwarding on an interface pings were still forwarded. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-19netfilter: nf_conntrack_reasm: properly handle packets fragmented into a ↵Patrick McHardy
single fragment When an ICMPV6_PKT_TOOBIG message is received with a MTU below 1280, all further packets include a fragment header. Unlike regular defragmentation, conntrack also needs to "reassemble" those fragments in order to obtain a packet without the fragment header for connection tracking. Currently nf_conntrack_reasm checks whether a fragment has either IP6_MF set or an offset != 0, which makes it ignore those fragments. Remove the invalid check and make reassembly handle fragment queues containing only a single fragment. Reported-and-tested-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-18ipv6: drop unused "dev" arg of icmpv6_send()Alexey Dobriyan
Dunno, what was the idea, it wasn't used for a long time. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-18ipv6: use standard lists for FIB walksAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-18ipv6: remove stale MIB definitionsAlexey Dobriyan
ICMP6 MIB statistics was per-netns for quite a time. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-17IPv6: convert mc_lock to spinlockStephen Hemminger
Only used for writing, so convert to spinlock Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-17net: remove INIT_RCU_HEAD() usageAlexey Dobriyan
call_rcu() will unconditionally reinitialize RCU head anyway. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>