Age | Commit message | Author
2008-01-28 | [CCID2]: Remove redundant synchronisation variable | Gerrit Renker
This removes the synchronisation variable `ccid2hctx_sendwait', which is set to 1 when the CCID2 sender may send a new packet, and to 0 otherwise.

The variable is redundant, since it is only used in combination with the hc_tx_send_packet/hc_tx_packet_sent function pair. Both functions are called under socket lock, so the following happens when the CCID2 sender may send a new packet:
 * it sets sendwait = 1 in tx_send_packet and returns 0;
 * the subsequent call to tx_packet_sent clears the sendwait flag;
 * since tx_send_packet returns 0 if and only if sendwait == 1, the BUG_ON condition in tx_packet_sent is never satisfied, since that function is never called when tx_send_packet returns a value different from 0 (cf. dccp_write_xmit);
 * the call to tx_packet_sent clears the flag so that the condition "!sendwait" is true the next time tx_packet_sent is called.

In other words, it is sufficient to just return 0 / not-0 to synchronise tx_send_packet and tx_packet_sent -- which is what the patch does.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
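The return-value synchronisation described in this commit can be sketched in user-space C as follows. This is an illustration only, not the kernel code: `struct tx_state`, the simplified function bodies and the caller `try_xmit()` are assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical, simplified sender state; not the kernel's structure. */
struct tx_state {
    unsigned int pipe;  /* packets currently in flight */
    unsigned int cwnd;  /* congestion window, in packets */
};

/* Returns 0 if a packet may be sent now, non-zero otherwise. */
static int tx_send_packet(const struct tx_state *tx)
{
    return tx->pipe < tx->cwnd ? 0 : 1;
}

/* Only ever called after tx_send_packet() has returned 0. */
static void tx_packet_sent(struct tx_state *tx)
{
    tx->pipe++;
}

/* Caller in the style of dccp_write_xmit: the return value alone
 * decides whether tx_packet_sent() runs, so no extra flag is needed. */
static bool try_xmit(struct tx_state *tx)
{
    if (tx_send_packet(tx) != 0)
        return false;
    tx_packet_sent(tx);
    return true;
}
```

Because both calls happen in sequence under the same lock, the 0 / not-0 result carries all the information the removed flag held.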
2008-01-28 | [CCID2]: Redundant debugging output | Gerrit Renker
This reduces the amount of redundant debugging messages:
 * pipe/cwnd are printed in both tx_send_packet() and tx_packet_sent(). Both functions are called immediately after one another, so one occurrence is sufficient.
 * Since tx_packet_sent() prints pipe/cwnd already, the second printk for pipe is redundant.
 * In tx_packet_sent() the check_sanity function is called twice (at the beginning and at the end).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Replace pipe assignment-function with assignment | Gerrit Renker
The function ccid2_change_pipe only does an assignment. This patch simplifies the code by replacing the function with the assignment it performs. Furthermore, the type of pipe is promoted from `signed' to unsigned (increasing the range). As a result, a BUG_ON test for negative values now becomes obsolete (for safety not removed, but replaced with a less annoying `DCCP_BUG'). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Replace cwnd assignment-function with assignment | Gerrit Renker
The current function ccid2_change_cwnd in effect makes only an assignment, as the test whether cwnd has reached 0 is only required when cwnd is halved. This patch simplifies the code by replacing the function with the assignment it performs. Furthermore, since ssthresh derives from cwnd and appears in many assignments and comparisons, the type of ssthresh has also been changed to match that of cwnd. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Replace read-only variable with constant | Gerrit Renker
This replaces the field member `numdupack', which was used as a read-only constant in the code, with a #define. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Remove unused variable | Gerrit Renker
This removes a variable `ccid2hctx_sent' which is incremented but never referenced/read (i.e., dead code). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Disable broken Ack Ratio adaptation algorithm | Gerrit Renker
This comments out a problematic section comprising a half-finished algorithm:
 - The variable `ccid2hctx_ackloss' is never initialised to a value different from 0 and hence in fact is a read-only constant.
 - The `arsent' variable counts packets other than Acks (it is incremented for every packet), and there is no test for Ack Loss.
 - The concept of counting Acks as such leads to a complex calculation, and the calculation at the moment is inconsistent with this concept.

The problem is that the number of Acks - rather than the number of windows - is counted, which leads to a complex (cubic/quadratic) expression - this is not even implemented.

In its current state, the commented-out algorithm interferes with normal processing by changing Ack Ratio incorrectly, and at the wrong times.

A new algorithm is necessary, which will not necessarily use the same variables as used by the unfinished one; hence the old variables have been removed.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Larger initial windows also for CCID2 | Gerrit Renker
RFC 4341, sec. 5 states that "The cwnd parameter is initialized to at most four packets for new connections, following the rules from [RFC3390]", which is implemented by this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
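For orientation, the RFC 3390 upper bound on the initial window is min(4*MSS, max(2*MSS, 4380 bytes)). The sketch below converts that bound into whole packets by rounding down; the function names are hypothetical and the rounding choice is an assumption of this example, so the patch itself may compute the packet count differently.

```c
#include <stdint.h>

/* RFC 3390 upper bound on the initial window, in bytes:
 * min(4 * MSS, max(2 * MSS, 4380)). */
static uint32_t rfc3390_initial_window_bytes(uint32_t mss)
{
    uint32_t lower = 2 * mss > 4380 ? 2 * mss : 4380;

    return 4 * mss < lower ? 4 * mss : lower;
}

/* Initial cwnd in packets, rounded down so the byte bound is never
 * exceeded. A zero MSS yields 0 instead of dividing by zero. */
static uint32_t initial_cwnd_packets(uint32_t mss)
{
    return mss ? rfc3390_initial_window_bytes(mss) / mss : 0;
}
```

With this rounding the result never exceeds four packets, matching the "at most four packets" wording quoted from RFC 4341.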
2008-01-28 | [DCCP]: Initialize dccp_sock before calling the ccid constructors | Arnaldo Carvalho de Melo
This is because in the next patch CCID2 will assume that dccps_mss_cache is non-zero. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID2]: Deadlock and spurious timeouts when Ack Ratio > cwnd | Gerrit Renker
This patch removes a bug in the current code. I agree with Andrea's comment that there is a problem here but the way it is treated does not fix it.

The problem is that whenever Ack Ratio > cwnd, starvation/deadlock occurs:
 * the receiver will not send an Ack until (Ack Ratio - cwnd) data packets have arrived;
 * the sender will not send any data packet before the receipt of an Ack advances the send window.

The only way that the connection then progressed was via RTO timeout. In one extreme case (bulk transfer), it was observed that this happened for every single packet; i.e. hundreds of packets, each an RTO timeout of 1..3 seconds apart: a transfer which normally would take a fraction of a second thus grew to several minutes.

The solution taken by this approach is to observe the relation "Ack Ratio <= cwnd" by using the constraint (1) from RFC 4341, 6.1.2; i.e. set Ack Ratio = ceil(cwnd / 2) and update it whenever either Ack Ratio or cwnd change. This ensures that the deadlock problem cannot arise.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
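A minimal sketch of the constraint Ack Ratio = ceil(cwnd / 2), written as stand-alone C. The function name is hypothetical, and the floor of 1 for cwnd == 0 is an assumption of this example rather than something stated in the commit.

```c
/* Ack Ratio derived from cwnd as ceil(cwnd / 2), computed without
 * overflow. For any cwnd >= 1 the result is <= cwnd, which is the
 * relation needed to avoid the starvation described above.
 * The floor of 1 for cwnd == 0 is an assumption of this sketch. */
static unsigned int ack_ratio_for_cwnd(unsigned int cwnd)
{
    unsigned int ratio = cwnd / 2 + cwnd % 2;

    return ratio ? ratio : 1;
}
```

Recomputing the ratio from cwnd each time cwnd changes keeps the two values from drifting apart.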
2008-01-28 | [CCID2]: Don't assign negative values to Ack Ratio | Gerrit Renker
Since it makes no sense to assign negative values to Ack Ratio, this patch disallows this possibility. As a consequence, a Bug test for negative Ack Ratio values becomes obsolete.

Furthermore, a check against overflow (as Ack Ratio may not exceed 2 bytes, due to RFC 4340, 11.3) has been added.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
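The two ideas in this commit, an unsigned parameter type and a two-byte upper bound, might look like the following sketch. The function name is hypothetical and the code is not taken from the patch.

```c
#include <stdint.h>

/* Taking an unsigned argument rules out negative values by type.
 * The clamp keeps the result within the two bytes that the
 * Ack Ratio feature can carry. */
static uint16_t clamp_ack_ratio(uint32_t val)
{
    return val > 0xFFFF ? 0xFFFF : (uint16_t)val;
}
```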
2008-01-28 | [CCID2]: Fix sequence number arithmetic/comparisons | Gerrit Renker
This replaces use of normal subtraction with modulo-48 subtraction. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
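To illustrate why plain subtraction is wrong for 48-bit sequence numbers held in 64-bit variables, here is a user-space sketch of modulo-48 arithmetic. The names `sub48` and `delta_seqno48` are chosen for this example and are not claimed to match the kernel helpers.

```c
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/* (a - b) modulo 2^48: wraps correctly across the 48-bit boundary. */
static uint64_t sub48(uint64_t a, uint64_t b)
{
    return (a - b) & SEQ48_MASK;
}

/* Signed distance from seq1 to seq2 in the circular 48-bit space:
 * positive if seq2 is ahead of seq1, negative if it is behind. */
static int64_t delta_seqno48(uint64_t seq1, uint64_t seq2)
{
    uint64_t d = sub48(seq2, seq1);

    /* Bit 47 set means the distance is negative in 48-bit terms. */
    return (d >> 47) ? (int64_t)d - (INT64_C(1) << 48) : (int64_t)d;
}
```

For example, the sequence number 2 is three steps ahead of 2^48 - 1, whereas an unmasked 64-bit subtraction would yield a huge unrelated value.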
2008-01-28 | [CCID2]: Bug in reading Ack Vectors | Gerrit Renker
In CCID2 the receiver-history is sorted in ascending order of sequence number, but the processing of received Ack Vectors requires the list traversal in the opposite direction. The current code has a bug in this regard: the list traversal is upwards. As a consequence, only Ack Vectors with a run length of 1 will pass, in all other Ack Vectors the remaining (acked) sequence numbers are missed, and may later falsely be identified as lost. Note: This bug is only visible when Ack Ratio > 1, since otherwise the run lengths of Ack Vectors are 0. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [ACKVEC]: Reduce length of identifiers | Gerrit Renker
This reduces the length of the struct ackvec/ackvec_record fields. It is a purely text-based replacement:
 s#dccpavr_#avr_#g;
 s#dccpav_#av_#g;
and increases readability somewhat.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [PCOUNTER] Fix build error without CONFIG_SMP | Ilpo Järvinen
I keep getting this build error and couldn't find anyone fixing it in archives. ...Maybe all net developers except me build just SMP kernels :-).

In file included from include/net/sock.h:50,
                 from ipc/mqueue.c:35:
include/linux/pcounter.h: In function 'pcounter_add':
include/linux/pcounter.h:87: error: 'struct pcounter' has no member named 'value'
make[1]: *** [ipc/mqueue.o] Error 1
make: *** [ipc] Error 2

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [IPV6]: Correct the comment concerning inetsw6 table | Pavel Emelyanov
It seems that net/ipv6/af_inet6.c was copied from net/ipv4/af_inet.c, but one comment was not fixed. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [UNIX] Move the unix sock iterators in to proper place | Pavel Emelyanov
The first_unix_socket() and next_unix_sockets() helpers are now used only in the proc file code and in the forall_unix_sockets macro.

The forall_unix_sockets macro is not used in this file at all, so remove it. After this, move the helpers to where they really belong, i.e. closer to the proc code under the #ifdef CONFIG_PROC_FS option.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [DCCP]: Update documentation on ioctls | Gerrit Renker
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [DCCP]: Ignore Ack Vectors / Elapsed Time on DCCP-Request also | Gerrit Renker
Small update with regard to RFC 4340 (references added as documentation): on Requests, Ack Vectors / Elapsed Time should be ignored. Length handling of Elapsed Time also simplified. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [DCCP]: Remove redundant dependency on IP_DCCP | Gerrit Renker
This cleans up the consequences of an earlier patch which introduced the `if IP_DCCP' clause into net/dccp/Kconfig. The CCID Kconfig menu is sourced within this clause; as a consequence, all tests of type `depends on IP_DCCP' are now redundant. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [DCCP]: Promote CCID2 as default CCID | Gerrit Renker
This patch addresses the following problems:

 1. DCCP relies for its proper functioning on having at least one CCID module enabled (as in TCP pluggable congestion control). Currently it is possible to disable both CCIDs and thus leave the DCCP module in a compiled, but entirely non-functional state: no sockets can be created when no CCID is available. Furthermore, the protocol is (again like TCP) not intended to be used without CCIDs. Last, a non-empty CCID list is needed for doing CCID feature negotiation.

 2. Internally the default CCID that is advertised by the Linux host is set to CCID2 (DCCPF_INITIAL_CCID in include/linux/dccp.h). Disabling CCID2 in the Kconfig menu without changing the defaults leads to a failure `module not found' when trying to load the dccp module (which internally tries to load the default CCID).

 3. The specification (RFC 4340, sec. 10) treats CCID2 somewhat like a `minimum common denominator'; the specification says that:
    * "New connections start with CCID 2 for both endpoints"
    * "A DCCP implementation intended for general use, such as an implementation in a general-purpose operating system kernel, SHOULD implement at least CCID 2. The intent is to make CCID 2 broadly available for interoperability [...]"
    Providing CCID2 as minimum-required CCID (like Reno/Cubic in TCP) thus seems reasonable.

Hence this patch automatically selects CCID2 when DCCP is enabled. Documentation also added.

Discussions with Ian McDonald on this subject are gratefully acknowledged.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [DCCP]: Update documentation | Gerrit Renker
This updates the DCCP documentation, following input from Ian McDonald, clarifying the status of DCCP, and adding a note about the test tree.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [DCCP]: Honour and make use of shutdown option set by user | Gerrit Renker
This extends the DCCP socket API by honouring any shutdown(2) option set by the user. The behaviour is, as much as possible, made consistent with the API for TCP's shutdown.

This patch exploits the information provided by the user via the socket API to reduce processing costs:
 * if the read end is closed (SHUT_RD), it is not necessary to deliver to input CCID;
 * if the write end is closed (SHUT_WR), the same idea applies, but with a difference - as long as the TX queue has not been drained, we need to receive feedback to keep congestion-control rates up to date. Hence SHUT_WR is honoured only after the last packet (under congestion control) has been sent;
 * although SHUT_RDWR seems nonsensical, it is nevertheless supported in the same manner as for TCP (and agrees with test for SHUTDOWN_MASK in dccp_poll() in net/dccp/proto.c).

Furthermore, most of the code already honours the sk_shutdown flags (dccp_recvmsg() for instance sets the read length to 0 if SHUT_RD had been called); CCID handling is now added to this by the present patch.

There will also no longer be any delivery when the socket is in the final stages, i.e. when one of dccp_close(), dccp_fin(), or dccp_done() has been called - which is fine since at that stage the connection is in its final stages.

Motivation and background are on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/shutdown

A FIXME has been added to notify the other end if SHUT_RD has been set (RFC 4340, 11.7).

Note: There is a comment in inet_shutdown() in net/ipv4/af_inet.c which asks to "make sure the socket is a TCP socket". This should probably be extended to mean `TCP or DCCP socket' (the code is also used by UDP and raw sockets).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [DCCP]: Make PARTOPEN an autonomous state | Gerrit Renker
This decouples PARTOPEN from TCP-specific stream-states. It thus addresses the FIXME. The code has been checked with regard to dependency on PARTOPEN and FIN_WAIT1 states (to which PARTOPEN previously was mapped): there is no difference, as PARTOPEN is always referred to directly (i.e. not via the mapping to TCP state). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Inline for moving average | Gerrit Renker
The moving average computation occurs so frequently in the CCID 3 code that it merits an inline function of its own. This uses a suggestion by Arnaldo as per http://www.mail-archive.com/dccp@vger.kernel.org/msg01662.html

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
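As an illustration of what such an inline helper can look like, here is a generic integer exponentially weighted moving average. The name `ewma_tenths`, the weight expressed in tenths, and the rule that a zero average is replaced by the first sample are assumptions of this sketch, not details confirmed by the commit message.

```c
#include <stdint.h>

/* Exponentially weighted moving average with an integer weight in
 * tenths (0..10): new_avg = (weight * avg + (10 - weight) * sample) / 10.
 * An average of 0 is treated as "no history yet", so the first
 * sample is taken as-is (an assumption of this sketch). */
static inline uint32_t ewma_tenths(uint32_t avg, uint32_t sample,
                                   uint8_t weight)
{
    if (avg == 0)
        return sample;
    return (uint32_t)(((uint64_t)weight * avg +
                       (uint64_t)(10 - weight) * sample) / 10);
}
```

With weight 9, an average of 100 and a new sample of 200 give 110, i.e. the history dominates and each sample nudges the average by one tenth of the difference.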
2008-01-28 | [CCID3]: Accurately determine idle & application-limited periods | Gerrit Renker
This fixes/updates the handling of idle and application-limited periods in CCID3, which currently is broken: there is no detection as to how long a sender has been idle - there is only one flag which is toggled in between function calls. Being obsolete now, the `idle' flag is removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Ignore trivial amounts of elapsed time | Gerrit Renker
This patch fixes a previously undiscovered bug; the problem is in computing the elapsed time as the time between `receiving' the packet (i.e. skb enters CCID module) and sending feedback: - there is no layer-processing, queueing, or delay involved, - hence the elapsed time is in the order of 1 function call - this is in the dimension of maximally 50..100usec - which renders the use of elapsed time almost entirely useless. The fix is simply to ignore such trivial amounts of elapsed time. As a further advantage, the now useless elapsed_time field can be removed from the socket, which reduces the socket structure by another four bytes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Revert use of MSS instead of s | Gerrit Renker
This updates the CCID3 code with regard to two instances of using `MSS' in place of `s': 1. The RFC3390-based initial rate: both rfc3448bis as well as the Faster Restart draft now consistently use `s' instead of MSS. 2. Now agrees with section 4.2 of rfc3448bis: "If the sender is ready to send data when it does not yet have a round trip sample, the value of X is set to s bytes per second, for segment size s [...]" Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [NET] proto: Use pcounters for the inuse field | Arnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [LIB]: Introduce struct pcounter | Arnaldo Carvalho de Melo
This just generalises what was introduced by Eric Dumazet for the struct proto inuse field in 286ab3d46058840d68e5d7d52e316c1f7e98c59f: [NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way. Please look at the comment in there to see the rationale. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | mac80211: remove more forgotten code | Johannes Berg
Hopefully that's the rest. Seems I didn't do a very thorough job removing the management interface. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | mac80211: adding 802.11n definitions in ieee80211.h | Ron Rindjunsky
This patch adds several structs and definitions to ieee80211.h to support 802.11n draft specifications. As 802.11n depends on and extends the 802.11e standard in several issues, there are also several definitions that belong to 802.11e. Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | mac80211: Remove local->scan_flags | Helmut Schaa
This patch removes all references to local->scan_flags as these are not used anymore since the removal of prism2 ioctls. Signed-off-by: Helmut Schaa <hschaa@suse.de> Signed-off-by: Jiri Benc <jbenc@suse.cz> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | mac80211: provide interface iterator for drivers | Johannes Berg
Sometimes drivers need to know which interfaces are associated with their hardware. Rather than forcing those drivers to keep track of the interfaces that were added, this adds an iteration function to mac80211. As it is intended to be used from the interface add/remove callbacks, the iteration function may currently only be called under RTNL. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [NET]: Compact sk_stream_mem_schedule() code | Pavel Emelyanov
This function references sk->sk_prot->xxx many times. It turned out that there is so much code in it that gcc cannot always optimize access to sk->sk_prot's fields.

After saving sk->sk_prot on the stack and comparing the disassembled code, it turned out that the function became ~10 bytes shorter and made fewer dereferences (on i386 and x86_64). Stack consumption didn't grow.

Besides, this patch brings most of this function within the 80-column limit.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
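The technique, loading a repeatedly used pointer into a local variable once, can be shown with a small stand-alone sketch. The structures and the function below are hypothetical stand-ins invented for this example; they are not the kernel's struct sock, struct proto, or sk_stream_mem_schedule().

```c
/* Hypothetical stand-ins for the kernel structures. */
struct proto_sketch {
    long memory_allocated;
    long limit_soft;
    long limit_hard;
};

struct sock_sketch {
    struct proto_sketch *prot;
};

/* Returns 0 below the soft limit, 1 between soft and hard, 2 above
 * the hard limit. sk->prot is read once into a local, so every
 * later field access goes through that local pointer. */
static int mem_pressure_level(const struct sock_sketch *sk, long amount)
{
    const struct proto_sketch *prot = sk->prot;
    long allocated = prot->memory_allocated + amount;

    if (allocated > prot->limit_hard)
        return 2;
    if (allocated > prot->limit_soft)
        return 1;
    return 0;
}
```

Besides helping the compiler, the local also shortens each line, which is the 80-column side effect the commit mentions.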
2008-01-28 | [NET]: Make netns cleanup to run in a separate queue | Benjamin Thery
This patch adds a separate workqueue for cleaning up a network namespace. If we use the keventd workqueue to execute cleanup_net(), there is a problem unregistering devices in IPv6. Indeed, the code that cleans up also schedules work in keventd: as long as cleanup_net() hasn't returned, dst_gc_task() cannot run, and as long as dst_gc_task() has not run, there are still some references pending on the net devices and cleanup_net() cannot unregister and exit the keventd workqueue.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Denis V. Lunev <den@openvz.org>
Acked-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [IPVS]: Relax the module get/put in ip_vs_app.c | Pavel Emelyanov
Both try_module_get/module_put already handle the module == NULL case, so no need in manual checking. This patch fits both net-2.6 and net-2.6.25. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [TUN]: Use iov_length() | Akinobu Mita
Use iov_length() instead of tun's homemade iov_total(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
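For readers unfamiliar with the helper, summing the segment lengths of an iovec array looks roughly like this in user-space C. The function name `iov_total_len` is made up for this sketch so that it is not mistaken for the kernel's iov_length().

```c
#include <stddef.h>
#include <sys/uio.h>

/* Total number of bytes described by an array of iovec segments. */
static size_t iov_total_len(const struct iovec *iov, size_t nr_segs)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < nr_segs; i++)
        total += iov[i].iov_len;
    return total;
}
```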
2008-01-28 | [NET] net/core/request_sock.c: Remove unused exports. | Adrian Bunk
This patch removes the following unused EXPORT_SYMBOL's: - reqsk_queue_alloc - __reqsk_queue_destroy - reqsk_queue_destroy Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [PATCH] IPV4 : Move ip route cache flush (secret_rebuild) from softirq to workqueue | Eric Dumazet
Every 600 seconds (ip_rt_secret_interval), a softirq flush of the whole ip route cache is triggered. On loaded machines, this can starve softirq for many seconds and can eventually crash.

This patch moves this flush to a workqueue context, using the worker we introduced in commit 39c90ece7565f5c47110c2fa77409d7a9478bd5b (IPV4: Convert rt_check_expire() from softirq processing to workqueue.)

Also, immediate flushes (echo 0 >/proc/sys/net/ipv4/route/flush) now use the rt_do_flush() helper function, which takes care of rescheduling.

The next step will be to handle delayed flushes ("echo -1 >/proc/sys/net/ipv4/route/flush" or "ip route flush cache").

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [RAW]: Consolidate proc interface. | Pavel Emelyanov
Both ipv6/raw.c and ipv4/raw.c use the seq files to walk through the raw sockets hash and show them. The "walking" code is rather huge, but is identical in both cases. The difference is the hash table to walk over and the protocol family to check (this was not in the first version of the patch, which was noticed by YOSHIFUJI).

Make the ->open store the needed hash table and the family on the allocated raw_iter_state and make the start/next/stop callbacks work with it. This removes most of the code.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [RAW]: Consolidate proto->unhash callback | Pavel Emelyanov
Same as the ->hash one, this is easily consolidated. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [RAW]: Consolidate proto->hash callback | Pavel Emelyanov
Having the raw_hashinfo it's easy to consolidate the raw[46]_hash functions. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [RAW]: Introduce raw_hashinfo structure | Pavel Emelyanov
The ipv4/raw.c and ipv6/raw.c files contain much common code (most of which is the proc interface) which can be consolidated. Most of the places to consolidate deal with the raw sockets hashtable, so introduce a struct raw_hashinfo which describes the raw sockets hash.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [IPv6] RAW: Compact the API for the kernel | Pavel Emelyanov
Same as in the previous patch for ipv4, compact the API and hide hash table and rwlock inside the raw.c file. Plus fix some "bad" places from checkpatch.pl point of view (assignments inside if()). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [IPv4] RAW: Compact the API for the kernel | Pavel Emelyanov
The raw sockets functions are explicitly used from inside the kernel in two places: 1. in ip_local_deliver_finish to intercept skb-s 2. in icmp_error For this purposes many functions and even data structures, that are naturally internal for raw protocol, are exported. Compact the API to two functions and hide all the other (including hash table and rwlock) inside the net/ipv4/raw.c Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [NET]: Consolidate net namespace related proc files creation. | Denis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [NET]: Make AF_UNIX per network namespace safe [v2] | Denis V. Lunev
Because of the global nature of garbage collection, and because of the cost of per namespace hash tables, unix_socket_table has been kept global. A filter has been added on lookups so that we don't see sockets from the wrong namespace.

Currently I don't fold the namespace into the hash, so multiple namespaces using the same socket name will be guaranteed a hash collision.

Changes from v1:
 - fixed unix_seq_open

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
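The approach of keeping one global table and filtering by namespace at lookup time can be sketched as below. All types and names are hypothetical and the hash bucket is reduced to a singly linked list; the real unix_socket_table code is organised differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a namespace handle and a socket entry
 * chained within one bucket of a table shared by all namespaces. */
struct net_sketch { int id; };

struct usock_sketch {
    const struct net_sketch *net;
    const char *name;
    struct usock_sketch *next;
};

/* Walks one bucket and skips entries from other namespaces, so two
 * namespaces may bind the same name without seeing each other. */
static struct usock_sketch *find_by_name(struct usock_sketch *bucket,
                                         const struct net_sketch *net,
                                         const char *name)
{
    struct usock_sketch *s;

    for (s = bucket; s != NULL; s = s->next) {
        if (s->net != net)
            continue;   /* the namespace filter */
        if (strcmp(s->name, name) == 0)
            return s;
    }
    return NULL;
}
```

Since the namespace is not part of the hash, identical names from different namespaces land in the same bucket, which is the guaranteed collision the commit message points out.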
2008-01-28 | [NET]: Make AF_PACKET handle multiple network namespaces | Denis V. Lunev
This is done by making packet_sklist_lock and packet_sklist per network namespace and adding an additional filter condition on received packets to ensure they came from the proper network namespace.

Changes from v1:
 - prohibit calling inet_dgram_ops.ioctl in namespaces other than init_net

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [NET]: Make the netlink methods in rtnetlink handle multiple network namespaces | Eric W. Biederman
After the previous prep work this just consists of removing checks limiting the code to work in the initial network namespace, and updating rtmsg_ifinfo so we can generate events for devices in something other than the initial network namespace.

Referring to other network devices, as the IFLA_LINK and IFLA_MASTER attributes do, gets interesting if those network devices happen to be in other network namespaces. Currently ifindex numbers are allocated globally so I have taken the path of least resistance and still report the information even though the devices they are talking about are invisible.

If applications start getting confused or when ifindex numbers become local to the network namespace we may need to do something different in the future.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Denis V. Lunev <den@openz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>