path: root/net/dccp/ccids/ccid3.c
Age | Commit message | Author
2008-09-04 | dccp: Return-value convention of hc_tx_send_packet() | Gerrit Renker
This patch reorganises the return value convention of the CCID TX sending function, to permit more flexible schemes, as required by subsequent patches.

Currently the convention is
 * values < 0 mean error,
 * a value == 0 means "send now", and
 * a value x > 0 means "send in x milliseconds".

The patch provides symbolic constants and a function to interpret return values. In addition, it caps the maximum positive return value to 0xFFFF milliseconds, corresponding to 65.535 seconds. This is possible since in CCID-3 the maximum inter-packet gap is t_mbi = 64 sec.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
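The convention described in this message can be sketched in user-space C as follows. The enum, macro, and function names are hypothetical illustrations and are not the identifiers introduced by the patch; only the three value ranges and the 0xFFFF cap come from the message itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the patch's actual symbolic constants may differ. */
enum tx_decision {
	TX_ERROR,	/* return value < 0                    */
	TX_SEND_NOW,	/* return value == 0                   */
	TX_SEND_LATER	/* return value x > 0: send in x msec  */
};

#define TX_MAX_DELAY_MS	0xFFFF	/* cap from the commit message: 65.535 s */

/* Classify a raw return value according to the convention above. */
enum tx_decision tx_classify(int rc)
{
	if (rc < 0)
		return TX_ERROR;
	return rc == 0 ? TX_SEND_NOW : TX_SEND_LATER;
}

/* Clamp a positive delay (in milliseconds) to the 16-bit maximum. */
int tx_cap_delay(int64_t delay_ms)
{
	assert(delay_ms > 0);
	return delay_ms > TX_MAX_DELAY_MS ? TX_MAX_DELAY_MS : (int)delay_ms;
}
```

Since t_mbi = 64 sec is 64000 msec, a delay of that size passes through the cap unchanged.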
2008-09-04 | dccp ccid-3: Remove dead states | Gerrit Renker
This patch is thanks to an investigation by Leandro Sales de Melo and his colleagues. They worked out two state diagrams which highlight the fact that the xxx_TERM states in CCID-3/4 are in fact not necessary.

And this can be confirmed by in turn looking at the code: the xxx_TERM states are only ever set in ccid3_hc_{rx,tx}_exit(). These two functions are part of the following call chain:
 * ccid_hc_{tx,rx}_exit() are called from ccid_delete() only;
 * ccid_delete() invokes ccid_hc_{tx,rx}_exit() in the way of a destructor: after calling ccid_hc_{tx,rx}_exit(), the CCID is released from memory;
 * ccid_delete() is in turn called only by ccid_hc_{tx,rx}_delete();
 * ccid_hc_{tx,rx}_delete() is called only
   - if feature negotiation failed (dccp_feat_activate_values()),
   - when changing the RX/TX CCID (to eject the current CCID),
   - when destroying the socket (in dccp_destroy_sock()).

In other words, when CCID-3 sets the state to xxx_TERM, it is at a time where no more processing should be going on, hence it is not necessary to introduce a dedicated exit state - this is implicit when unloading the CCID.

The patch removes this state; one switch-statement collapses as a result.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 | dccp: Unused argument in CCID tx function | Gerrit Renker
This removes the argument `more' from ccid_hc_tx_packet_sent, since it was nowhere used in the entire code. (Anecdotally, this argument was not even used in the original KAME code where the function originally came from; compare the variable moreToSend in the freebsd61-dccp-kame-28.08.2006.patch now maintained by Emmanuel Lochin.) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 | dccp ccid-3: Remove redundant 'options_received' struct | Gerrit Renker
The `options_received' struct is redundant, since it duplicates the existing `p' and `x_recv' fields. This patch removes the sub-struct and migrates the format conversion operations (cf. below) to ccid3_hc_tx_parse_options().

Why the fields are redundant
----------------------------
The Loss Event Rate p and the Receive Rate x_recv are initially 0 when first loading CCID-3, as ccid_new() zeroes out the entire ccid3_hc_tx_sock.

When Loss Event Rate or Receive Rate options are received, they are stored by ccid3_hc_tx_parse_options() into the fields `ccid3or_loss_event_rate' and `ccid3or_receive_rate' of the sub-struct `options_received' in ccid3_hc_tx_sock.

After parsing (considering only the established state - dccp_rcv_established()), the packet is passed on to ccid_hc_tx_packet_recv(). This calls the CCID-3 specific routine ccid3_hc_tx_packet_recv(), which performs the following copy operations between fields of ccid3_hc_tx_sock:
 * hctx->options_received.ccid3or_receive_rate is copied into hctx->x_recv, after scaling it for fixpoint arithmetic, by 2^6 (a factor of 64);
 * hctx->options_received.ccid3or_loss_event_rate is copied into hctx->p, considering the above special cases; in addition, a value of 0 here needs to be mapped into p=0 (when no Loss Event Rate option has been received yet).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 | dccp tfrc/ccid-3: Computing Loss Rate from Loss Event Rate | Gerrit Renker
This adds a function to take care of the following cases occurring in the computation of the Loss Rate p:
 * 1/(2^32-1) is mapped into 0% as per RFC 4342, 8.5;
 * 1/0 is mapped into the maximum of 100%;
 * we want to avoid that p = 1/x is rounded down to 0 when x is very large, since this means accidentally re-entering slow-start (indicated by p==0).

In the last case, the minimum-resolution value of p is returned.

Furthermore, a bug in ccid3_hc_rx_getsockopt is fixed (1/0 was mapped into ~0U), which now allows the scaled p-values to be printed consistently as

    printf("Loss Event Rate = %u.%04u %%\n", rx_info.tfrcrx_p / 10000, rx_info.tfrcrx_p % 10000);

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
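The three cases listed in this message can be sketched as a small user-space function. This is an illustration under stated assumptions, not the function added by the patch: the name `invert_loss_event_rate` is hypothetical, the scale of 10^6 for 100% is inferred from the printf format in the message, and the minimum-resolution value `P_SMALLEST` is an assumed placeholder.

```c
#include <assert.h>
#include <stdint.h>

#define P_SCALE		1000000u	/* p is scaled so that 100% == 1000000   */
#define P_SMALLEST	1u		/* assumed minimum-resolution value of p */

/* Map a Loss Event Rate option value (1/p) to the scaled loss rate p. */
uint32_t invert_loss_event_rate(uint32_t p_inv)
{
	uint32_t p;

	if (p_inv == UINT32_MAX)	/* 1/(2^32-1): no loss, RFC 4342, 8.5 */
		return 0;
	if (p_inv == 0)			/* 1/0: map to the maximum of 100%    */
		return P_SCALE;

	p = P_SCALE / p_inv;
	assert(p <= P_SCALE);
	/* Never round a real loss rate down to 0 (p == 0 signals slow-start). */
	return p > P_SMALLEST ? p : P_SMALLEST;
}
```

With this scaling, a Loss Event Rate of 1/4 yields p = 250000, which the printf format from the message would render as 25.0000 %.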
2008-09-04 | dccp: Add packet type information to CCID-specific option parsing | Gerrit Renker
This patch ...
 1. adds packet type information to ccid_hc_{rx,tx}_parse_options(). This is necessary, since table 3 in RFC 4340, 5.8 leaves it to the CCIDs to state which options may (not) appear on what packet type.
 2. adds such a check for CCID-3's {Loss Event, Receive} Rate as specified in RFC 4342, 8.3 ("Receive Rate options MUST NOT be sent on DCCP-Data packets") and 8.5 ("Loss Event Rate options MUST NOT be sent on DCCP-Data packets").
 3. removes an unused argument `idx' from ccid_hc_{rx,tx}_parse_options(). This is also no longer necessary, since the CCID-specific option-parsing routines are passed every single parameter of the type-length-value option encoding.

Also added documentation and made argument naming scheme consistent.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 | dccp ccid-3: Simplify and consolidate tx_parse_options | Gerrit Renker
This simplifies and consolidates the TX option-parsing code:
 1. The Loss Intervals option is not currently used, so dead code related to this option is removed. I am aware of no plans to support the option, but if someone wants to implement it (e.g. for inter-op tests), it is better to start afresh than having to also update currently unused code.
 2. The Loss Event and Receive Rate options have a lot of code in common (both are 32 bit, both have same length etc.), so this is consolidated.
 3. The test against GSR is not necessary, because
    - on first loading CCID3, ccid_new() zeroes out all fields in the socket;
    - ccid3_hc_tx_packet_recv() treats 0 and ~0U equivalently, due to

          pinv = opt_recv->ccid3or_loss_event_rate;
          if (pinv == ~0U || pinv == 0)
                  hctx->p = 0;

    - as a result, the sequence number field is removed from opt_recv.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 | dccp ccid-3: Remove ugly RTT-sampling history lookup | Gerrit Renker
This removes the RTT-sampling function tfrc_tx_hist_rtt(), since
 1. it suffered from complex passing of return values (the return value indicated a successful lookup and at the same time doubled as the RTT sample);
 2. when for some odd reason the sample value equalled 0, this triggered a bug warning about "bogus Ack", due to the ambiguity of the return value;
 3. on a passive host which has not sent anything the TX history is empty and thus will lead to unwanted "bogus Ack" warnings such as

        ccid3_hc_tx_packet_recv: server(e7b7d518): DATAACK with bogus ACK-28197148
        ccid3_hc_tx_packet_recv: server(e7b7d518): DATAACK with bogus ACK-26641606.

The fix is to replace the implicit encoding by performing the steps manually.

Furthermore, the "bogus Ack" warning has been removed, since it can actually be triggered due to several reasons (network reordering, old packet, (3) above), hence it is not very useful.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
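The ambiguity described in points 1 and 2 goes away once the lookup result and the RTT sample travel separately. The following user-space sketch shows that idea only; the structure and function names are hypothetical and do not reproduce the kernel's TX history interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tx_entry {		/* hypothetical TX history record */
	uint64_t seqno;
	uint64_t sent_us;	/* send time in microseconds */
};

/* Explicit lookup: NULL means "not found", independent of any RTT value. */
const struct tx_entry *tx_hist_find(const struct tx_entry *hist, size_t n,
				    uint64_t ackno)
{
	for (size_t i = 0; i < n; i++)
		if (hist[i].seqno == ackno)
			return &hist[i];
	return NULL;
}

/* Returns true and stores the sample; a sample of 0 is a valid result. */
bool rtt_sample_us(const struct tx_entry *hist, size_t n, uint64_t ackno,
		   uint64_t now_us, uint64_t *rtt_us)
{
	const struct tx_entry *e = tx_hist_find(hist, n, ackno);

	assert(rtt_us != NULL);
	if (e == NULL)
		return false;	/* e.g. empty history on a passive host */
	*rtt_us = now_us - e->sent_us;
	return true;
}
```

An empty history simply yields `false`, so a caller can skip RTT sampling quietly instead of issuing a warning.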
2008-09-04 | dccp ccid-3: Bug fix for the inter-packet scheduling algorithm | Gerrit Renker
This fixes a subtle bug in the calculation of the inter-packet gap and shows that t_delta, as it is currently used, is not needed and is hence replaced.

The algorithm from RFC 3448, 4.6 below continually computes a send time t_nom, which is initialised with the current time t_now; t_gran = 1E6 / HZ specifies the scheduling granularity, s the packet size, and X the sending rate:

    t_distance = t_nom - t_now;           // in microseconds
    t_delta    = min(t_ipi, t_gran) / 2;  // `delta' parameter in microseconds

    if (t_distance >= t_delta) {
        reschedule after (t_distance / 1000) milliseconds;
    } else {
        t_ipi  = s / X;                   // inter-packet interval in usec
        t_nom += t_ipi;                   // compute the next send time
        send packet now;
    }

1) Description of the bug
-------------------------
Rescheduling requires a conversion into milliseconds, due to this call chain:
 * ccid3_hc_tx_send_packet() returns a timeout in milliseconds,
 * this value is converted by msecs_to_jiffies() in dccp_write_xmit(),
 * and finally used as jiffy-expires-value for sk_reset_timer().

The highest jiffy resolution with HZ=1000 is 1 millisecond, so using a higher granularity does not make much sense here. As a consequence, values of t_distance < 1000 are truncated to 0. This issue has so far been resolved by using instead

    if (t_distance >= t_delta + 1000)
        reschedule after (t_distance / 1000) milliseconds;

The bug is in artificially inflating t_delta to t_delta' = t_delta + 1000. This is unnecessarily large; a more adequate value is t_delta' = max(t_delta, 1000).

2) Consequences of using the corrected t_delta'
-----------------------------------------------
Since t_delta <= t_gran/2 = 10^6/(2*HZ), we have t_delta <= 1000 as long as HZ >= 500. This means that t_delta' = max(1000, t_delta) is constant at 1000.

On the other hand, when using a coarse HZ value of HZ < 500, we have three sub-cases that can all be reduced to using another constant of t_gran/2.
 (a) The first case arises when t_ipi > t_gran. Here t_delta' is the constant t_delta' = max(1000, t_gran/2) = t_gran/2.
 (b) If t_ipi <= 2000 < t_gran = 10^6/HZ usec, then t_delta = t_ipi/2 <= 1000, so that t_delta' = max(1000, t_delta) = 1000 < t_gran/2.
 (c) If 2000 < t_ipi <= t_gran, we have t_delta' = max(t_delta, 1000) = t_ipi/2.

In the second and third cases we have delay values less than t_gran/2, which is in the order of less than or equal to half a jiffy. How these are treated depends on how fractions of a jiffy are handled: they are either always rounded down to 0, or always rounded up to 1 jiffy (assuming non-zero values). In both cases the error is on average in the order of 50%. Thus we are not increasing the error when in the second/third case we replace a value less than t_gran/2 with 0, by setting t_delta' to the constant t_gran/2.

3) Summary
----------
Fixing (1) and considering (2), the patch replaces t_delta with a constant, whose value depends on CONFIG_HZ, changing the above algorithm to:

    if (t_distance >= t_delta')
        reschedule after (t_distance / 1000) milliseconds;

where t_delta' = 10^6/(2*HZ) if HZ < 500, and t_delta' = 1000 otherwise.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
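The summary rule can be written out as a short user-space sketch. The function names are hypothetical and `hz` stands in for CONFIG_HZ, which in the kernel is a compile-time constant; the thresholds are those stated in the summary above.

```c
#include <assert.h>
#include <stdint.h>

/* t_delta' in microseconds for a given tick rate (hz stands in for CONFIG_HZ). */
uint32_t t_delta_prime_us(uint32_t hz)
{
	assert(hz > 0);
	return hz < 500 ? 1000000u / (2 * hz) : 1000u;
}

/*
 * Decide whether to reschedule. Returns the delay in milliseconds,
 * or 0 when the packet should be sent now.
 */
uint32_t reschedule_delay_ms(int64_t t_distance_us, uint32_t hz)
{
	if (t_distance_us >= (int64_t)t_delta_prime_us(hz))
		return (uint32_t)(t_distance_us / 1000);
	return 0;
}
```

Because t_delta' is never below 1000 usec, any rescheduling delay computed this way is at least 1 msec, so the truncation to 0 described in part (1) cannot occur on the reschedule path.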
2008-09-04 | dccp ccid-3: No more CCID control blocks in LISTEN state | Gerrit Renker
The CCIDs are activated as the last of the features, at the end of the handshake, where the LISTEN state of the master socket is inherited into the server state of the child socket. Thus, the only states visible to CCIDs now are OPEN/PARTOPEN, and the closing states.

This allows removing tests which were previously necessary to protect against referencing a socket in the listening state (in CCID3), but which now have become redundant.

As a further byproduct of enabling the CCIDs only after the connection has been fully established, several typecast-initialisations of ccid3_hc_{rx,tx}_sock can now be eliminated:
 * the CCID is loaded, so it is not necessary to test if it is NULL,
 * if it is possible to load a CCID and leave the private area NULL, then this is a bug, which should crash loudly - and earlier,
 * the test for state==OPEN || state==PARTOPEN now reduces only to the closing phase (e.g. when the node has received an unexpected Reset).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04 | dccp ccid-3: Remove ccid3hc{tx,rx}_ prefixes | Gerrit Renker
This patch does the same for CCID-3 as the previous patch for CCID-2: s#ccid3hctx_##g; s#ccid3hcrx_##g; plus manual editing to retain consistency. Please note: expanded the fields of the `struct tfrc_tx_info' in the hc_tx_sock, since using short #define identifiers is not a good idea. The only place where this embedded struct was used is ccid3_hc_tx_getsockopt(). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 | dccp: Toggle debug output without module unloading | Gerrit Renker
This sets the sysfs permissions so that root can toggle the `debug' parameter available for nearly every DCCP module. This is useful since there are various module inter-dependencies. The debug flag can now be toggled at runtime using

    echo 1 > /sys/module/dccp/parameters/dccp_debug
    echo 1 > /sys/module/dccp_ccid2/parameters/ccid2_debug
    echo 1 > /sys/module/dccp_ccid3/parameters/ccid3_debug
    echo 1 > /sys/module/dccp_tfrc_lib/parameters/tfrc_debug

The last is not very useful yet, since no code at the moment calls the tfrc_debug() macro.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-07-13 | dccp ccid-3: Fix a loss detection bug | Gerrit Renker
This fixes a bug in the logic of the TFRC loss detection:
 * new_loss_indicated() should not be called while a loss is pending;
 * but the code allows this;
 * thus, for two subsequent gaps in the sequence space, when loss_count has not yet reached NDUPACK=3, the loss_count is falsely reduced to 1.

To avoid further and similar problems, all loss handling and loss detection is now done inside tfrc_rx_hist_handle_loss(), using an appropriate routine to track new losses.

Further changes:
----------------
 * added a reminder that no RX history operations should be performed when rx_handle_loss() has identified a (new) loss, since the function takes care of packet reordering during loss detection;
 * made tfrc_rx_hist_loss_pending() bool (thanks to an earlier suggestion by Arnaldo);
 * removed unused functions.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-07-13 | dccp: Upgrade NDP count from 3 to 6 bytes | Gerrit Renker
RFC 4340, 7.7 specifies up to 6 bytes for the NDP Count option, whereas the code is currently limited to up to 3 bytes. This seems to be a relic of an earlier draft version and is brought up to date by the patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 | dccp: Fix sparse warnings | Gerrit Renker
This patch fixes the following sparse warnings:
 * nested min(max()) expression:

       net/dccp/ccids/ccid3.c:91:21: warning: symbol '__x' shadows an earlier one
       net/dccp/ccids/ccid3.c:91:21: warning: symbol '__y' shadows an earlier one

 * Declaration of function prototypes in .c instead of .h file, resulting in "should it be static?" warnings.
 * Declared "struct dccpw" static (local to dccp_probe).
 * Disabled dccp_delayed_ack() - not fully removed due to RFC 4340, 11.3 ("Receivers SHOULD implement delayed acknowledgement timers ...").
 * Used a different local variable name to avoid

       net/dccp/ackvec.c:293:13: warning: symbol 'state' shadows an earlier one
       net/dccp/ackvec.c:238:33: originally declared here

 * Removed unused functions `dccp_ackvector_print' and `dccp_ackvec_print'.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 | dccp ccid-3: Bug-Fix - Zero RTT is possible | Gerrit Renker
In commit $(825de27d9e40b3117b29a79d412b7a4b78c5d815) (from 27th May, commit message `dccp ccid-3: Fix "t_ipi explosion" bug'), the CCID-3 window counter computation was fixed to cope with RTTs < 4 microseconds. Such RTTs can be found e.g. when running CCID-3 over loopback. The fix removed a check against RTT < 4, but introduced a divide-by-zero bug. All steady-state RTTs in DCCP are filtered using dccp_sample_rtt(), which ensures non-zero samples. However, a zero RTT is possible on initialisation, when there is no RTT sample from the Request/Response exchange. The fix is to use the fallback-RTT from RFC 4340, 3.4. This is also better than just fixing update_win_count() since it allows other parts of the code to always assume that the RTT is non-zero during the time that the CCID is used. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
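The fallback described here amounts to substituting a fixed default whenever no handshake sample exists. A minimal sketch, assuming that 0 denotes "no sample available" and using the 0.2 sec value that RFC 4340, 3.4 gives for the default RTT; the macro and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define FALLBACK_RTT_US	200000u	/* 0.2 sec, per RFC 4340, 3.4 */

/* Pick the RTT used to initialise the CCID; 0 means "no sample available". */
uint32_t initial_rtt_us(uint32_t handshake_rtt_us)
{
	uint32_t rtt = handshake_rtt_us ? handshake_rtt_us : FALLBACK_RTT_US;

	assert(rtt > 0);	/* later divisions by the RTT are now safe */
	return rtt;
}
```

Fixing the value once at initialisation lets every later user of the RTT rely on it being non-zero, which is the advantage the message names over patching a single call site.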
2008-05-27 | dccp ccid-3: Fix "t_ipi explosion" bug | Gerrit Renker
The identification of this bug is thanks to Cheng Wei and Tomasz Grobelny. To avoid divide-by-zero, the implementation previously ignored RTTs smaller than 4 microseconds when performing integer division RTT/4. When the RTT reached a value less than 4 microseconds (as observed on loopback), this prevented the Window Counter CCVal value from advancing. As a result, the receiver stopped sending feedback. This in turn caused non-ending expiries of the nofeedback timer at the sender, so that the sending rate was progressively reduced until reaching the minimum of one packet per 64 seconds. The patch fixes this bug by handling integer division more intelligently. Due to consistent use of dccp_sample_rtt(), divide-by-zero-RTT is avoided. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
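The effect of the guarded division can be seen by comparing two ways of counting elapsed quarter-RTTs. This is a sketch only: both function names are hypothetical, and multiplying before dividing is one plausible reading of "handling integer division more intelligently", not a quotation of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Guarded form: RTT/4 is 0 for rtt_us < 4, so such RTTs were ignored. */
uint64_t quarter_rtts_guarded(uint32_t delta_us, uint32_t rtt_us)
{
	if (rtt_us < 4)
		return 0;	/* the window counter never advances */
	return delta_us / (rtt_us / 4);
}

/* Reordered form: multiply first, so only rtt_us == 0 must be excluded. */
uint64_t quarter_rtts_reordered(uint32_t delta_us, uint32_t rtt_us)
{
	assert(rtt_us > 0);	/* assumed guaranteed by non-zero RTT sampling */
	return (4ull * delta_us) / rtt_us;
}
```

For an RTT of 3 usec and 10 usec of elapsed time, the guarded form reports no progress while the reordered form reports 13 quarter-RTTs, so the counter keeps advancing on loopback.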
2008-05-02 | dccp: ccid2.c, ccid3.c use clamp(), clamp_t() | Harvey Harrison
Makes the intention of the nested min/max clear. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Kill some bloat | Ilpo Järvinen
Without a number of CONFIG.*DEBUG:

net/dccp/ccids/ccid3.c:
  ccid3_hc_tx_update_x          | -170
  ccid3_hc_tx_packet_sent       | -175
  ccid3_hc_tx_packet_recv       | -169
  ccid3_hc_tx_no_feedback_timer | -192
  ccid3_hc_tx_send_packet       | -144
 5 functions changed, 850 bytes removed, diff: -850

net/dccp/ccids/ccid3.c:
  ccid3_update_send_interval | +191
 1 function changed, 191 bytes added, diff: +191

net/dccp/ccids/ccid3.o:
 6 functions changed, 191 bytes added, 850 bytes removed, diff: -659

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Nofeedback timer according to rfc3448bis | Gerrit Renker
This implements the changes to the nofeedback timer handling suggested in draft rfc3448bis00, section 4.4. In particular, these changes mean: * better handling of the lossless case (p == 0) * the timestamp for computing t_ld becomes obsolete * much more recent document (RFC 3448 is almost 5 years old) * concepts in rfc3448bis arose from a real, working implementation (cf. sec. 12) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Implement rfc3448bis changes to feedback reception | Gerrit Renker
This implements the algorithm to update the allowed sending rate X upon receiving feedback packets, as described in draft rfc3448bis, 4.2/4.3. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Remove two irrelevant states in TX feedback handling | Gerrit Renker
* the NO_SENT state is only triggered in bidirectional mode, costing unnecessary processing. * the TERM (terminating) state is irrelevant. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Use a function to update p_inv, and p is never used | Gerrit Renker
This patch 1) concentrates previously scattered computation of p_inv into one function; 2) removes the `p' element of the CCID3 RX sock (it is redundant); 3) makes the tfrc_rx_info structure standalone, only used on demand. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID]: More informative registration | Gerrit Renker
The patch makes the registration messages of CCID 2/3 a bit more informative: instead of repeating the CCID number as currently done, "CCID: Registered CCID 2 (ccid2)" or "CCID: Registered CCID 3 (ccid3)", the descriptive names of the CCID's (from RFCs) are now used: "CCID: Registered CCID 2 (TCP-like)" and "CCID: Registered CCID 3 (TCP-Friendly Rate Control)". To allow spaces in the name, the slab name string has been changed to refer to the numeric CCID identifier, using the same format as before. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Interface CCID3 code with newer Loss Intervals Database | Gerrit Renker
This hooks up the TFRC Loss Interval database with CCID 3 packet reception. In addition, it makes the CCID-specific computation of the first loss interval (which requires access to all the guts of CCID3) local to ccid3.c. The patch also fixes an omission in the DCCP code, that of a default / fallback RTT value (defined in section 3.4 of RFC 4340 as 0.2 sec); while at it, the upper bound of 4 seconds for an RTT sample has been reduced to match the initial TCP RTO value of 3 seconds from [RFC 1122, 4.2.3.1]. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Redundant debugging output / documentation | Gerrit Renker
Each time feedback is sent two lines are printed:

    ccid3_hc_rx_send_feedback: client ... - entry
    ccid3_hc_rx_send_feedback: Interval ...usec, X_recv=..., 1/p=...

The first line is redundant and thus removed. Further, documentation of ccid3_hc_rx_sock (capitalisation) is made consistent.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: HC-receiver should not insert timestamps as HC-sender doesn't uses it | Gerrit Renker
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [TFRC]: New rx history code | Arnaldo Carvalho de Melo
Credit here goes to Gerrit Renker, who provided the initial implementation for this new codebase. I modified it just to try to make it closer to the existing API, renaming some functions, adding namespacing and fixing one bug where tfrc_rx_hist_alloc was not freeing the allocated ring entries on the error path.

Original changeset comment from Gerrit:
-----------
This provides a new, self-contained and generic RX history service for TFRC based protocols.

Details:
 * new data structure, initialisation and cleanup routines;
 * allocation of dccp_rx_hist entries local to packet_history.c, as a service exported by the dccp_tfrc_lib module.
 * interface to automatically track highest-received seqno;
 * receiver-based RTT estimation (needed for instance by RFC 3448, 6.3.1);
 * a generic function to test for `data packets' as per RFC 4340, sec. 7.7.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: The receiver of a half-connection does not set window counter values | Gerrit Renker
Only the sender sets window counters [RFC 4342, sections 5 and 8.1]. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [TFRC]: Rename dccp_rx_ to tfrc_rx_ | Arnaldo Carvalho de Melo
This is in preparation for merging the new rx history code written by Gerrit Renker. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [TFRC]: Make the rx history slab be global | Arnaldo Carvalho de Melo
This is in preparation for merging the new rx history code written by Gerrit Renker. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [TFRC]: Hide tx history details from the CCIDs | Arnaldo Carvalho de Melo
Based on a previous patch by Gerrit Renker. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [TFRC]: Migrate TX history to singly-linked lis | Arnaldo Carvalho de Melo
This patch was based on another made by Gerrit Renker; his changelog was:
------------------------------------------------------
The patch set migrates TFRC TX history to a singly-linked list.

The details are:
 * use of a consistent naming scheme (all TFRC functions now begin with `tfrc_');
 * allocation and cleanup are taken care of internally;
 * provision of a lookup function, which is used by the CCID TX infrastructure to determine the time a packet was sent (in turn used for RTT sampling);
 * integration of the new interface with the present use in CCID3.
------------------------------------------------------
Simplifications I did:
 . removing the tfrc_tx_hist_head that had a pointer to the list head and another for the slabcache.
 . No need for creating a slabcache for each CCID that wants to use the TFRC tx history routines, create a single slabcache when the dccp_tfrc_lib module init routine is called.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Inline for moving average | Gerrit Renker
The moving average computation occurs so frequently in the CCID 3 code that it merits an inline function of its own. This uses a suggestion by Arnaldo as per http://www.mail-archive.com/dccp@vger.kernel.org/msg01662.html Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
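The message does not spell out the formula, so the following is a sketch under assumptions rather than the inline function from the patch. It uses the exponentially weighted form that RFC 3448, 4.3 gives for RTT filtering, R = q*R + (1-q)*R_sample, with the weight expressed in tenths to stay in integer arithmetic; the function name and the treatment of 0 as "no average yet" are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Exponentially weighted moving average with weights in tenths:
 * weight/10 goes to the old average, (10 - weight)/10 to the new sample.
 * An average of 0 is treated as "no sample yet" (an assumption here).
 */
uint32_t ewma_tenths(uint32_t avg, uint32_t newval, uint8_t weight)
{
	assert(weight <= 10);
	if (avg == 0)
		return newval;
	return (uint32_t)(((uint64_t)weight * avg +
			   (uint64_t)(10 - weight) * newval) / 10);
}
```

With weight 9 (q = 0.9, the value RFC 3448 recommends), an average of 1000 and a new sample of 2000 give 1100.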
2008-01-28 | [CCID3]: Accurately determine idle & application-limited periods | Gerrit Renker
This fixes/updates the handling of idle and application-limited periods in CCID3, which currently is broken: there is no detection as to how long a sender has been idle - there is only one flag which is toggled in between function calls. Being obsolete now, the `idle' flag is removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Ignore trivial amounts of elapsed time | Gerrit Renker
This patch fixes a previously undiscovered bug; the problem is in computing the elapsed time as the time between `receiving' the packet (i.e. skb enters CCID module) and sending feedback:
 - there is no layer-processing, queueing, or delay involved,
 - hence the elapsed time is in the order of 1 function call
 - this is in the dimension of maximally 50..100usec
 - which renders the use of elapsed time almost entirely useless.

The fix is simply to ignore such trivial amounts of elapsed time.

As a further advantage, the now useless elapsed_time field can be removed from the socket, which reduces the socket structure by another four bytes.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [CCID3]: Revert use of MSS instead of s | Gerrit Renker
This updates the CCID3 code with regard to two instances of using `MSS' in place of `s': 1. The RFC3390-based initial rate: both rfc3448bis as well as the Faster Restart draft now consistently use `s' instead of MSS. 2. Now agrees with section 4.2 of rfc3448bis: "If the sender is ready to send data when it does not yet have a round trip sample, the value of X is set to s bytes per second, for segment size s [...]" Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 | [NET]: Convert init_timer into setup_timer | Pavel Emelyanov
A great deal of code in the kernel initializes timer->function and timer->data together with calling init_timer(timer). There is already a helper for this. Use it for networking code. The patch is HUGE, but makes the code 130 lines shorter (98 insertions(+), 228 deletions(-)). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-20 | [DCCP]: Spelling fixes | Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-24 | [CCID2/3]: Initialisation assignments of 0 are redundant | Gerrit Renker
Assigning initial values of `0' is redundant when loading a new CCID structure, since in net/dccp/ccid.c the entire CCID structure is zeroed out prior to initialisation in ccid_new():

    struct ccid {
            struct ccid_operations *ccid_ops;
            char ccid_priv[0];
    };
    // ...
    if (rx) {
            memset(ccid + 1, 0, ccid_ops->ccid_hc_rx_obj_size);
            if (ccid->ccid_ops->ccid_hc_rx_init != NULL &&
                ccid->ccid_ops->ccid_hc_rx_init(ccid, sk) != 0)
                    goto out_free_ccid;
    } else {
            memset(ccid + 1, 0, ccid_ops->ccid_hc_tx_obj_size);
            /* analogous to the rx case */
    }

This patch therefore removes the redundant assignments. Thanks to Arnaldo for the inspiration.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 | [DCCP]: Unaligned pointer access | Gerrit Renker
This fixes `unaligned (read) access' errors of the type

    Kernel unaligned access at TPC[100f970c] dccp_parse_options+0x4f4/0x7e0 [dccp]
    Kernel unaligned access at TPC[1011f2e4] ccid3_hc_tx_parse_options+0x1ac/0x380 [dccp_ccid3]
    Kernel unaligned access at TPC[100f9898] dccp_parse_options+0x680/0x880 [dccp]

by using the get_unaligned macro for parsing options.

Committer note: Preserved the sparse __be{16,32} annotations.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
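Option values sit at arbitrary byte offsets inside a packet, so dereferencing a casted 32-bit pointer can trap on strict-alignment architectures. The kernel's get_unaligned macro is not available in user space; the sketch below shows the same idea with a byte-wise copy. The function name is hypothetical, and the big-endian decoding reflects network byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit option value from an arbitrarily aligned buffer. */
uint32_t read_be32_unaligned(const unsigned char *p)
{
	unsigned char b[4];

	assert(p != NULL);
	memcpy(b, p, sizeof(b));	/* byte copy: no alignment requirement */
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

Reading from `buf + 1` of a byte array works the same way as reading from `buf`, which is the property that matters when walking type-length-value options.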
2007-10-10 | [DCCP]: Make all `debug' parameters bool | Gerrit Renker
This just sets the parameter to bool, since debugging messages are either on or off. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [CCID3]: Move NULL-protection into function | Gerrit Renker
This moves several instances of testing against NULL into the function which is used to de-reference the CCID-private data. Committer note: Made the BUG_ON depend on having CONFIG_IP_DCCP_CCID3_DEBUG, as it is too much to have this on production code. Also made sure that the macro is used only after checking if sk_state is not LISTEN, to make it equivalent to what we had before. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-10-10 | [DCCP]: Simplify interface of dccp_sample_rtt | Gerrit Renker
The third parameter of dccp_sample_rtt now becomes useless and is removed. Also combined the subtraction of the timestamp echo and the elapsed time. This is safe, since (a) presence of timestamp echo is tested first and (b) elapsed time is either present and non-zero or it is not set and equals 0 due to the memset in dccp_parse_options. To avoid measuring option-processing time, the timestamp for measuring the initial Request/Response RTT sample is taken directly when the function is called (the Linux implementation always adds a timestamp on the Request, so there is no loss in doing this). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [DCCP]: Reuse ktime_get_real() calls again | Gerrit Renker
This patch reduces the number of timestamps taken in the receive path for each packet.

The ccid3_hc_tx_update_x() routine is called in
 * the receive path for each CCID3-controlled packet
 * for the nofeedback timer (if no feedback arrives during 4 RTT)

Currently, when there is no loss, each packet gets timestamped twice. The patch resolves this by recycling the first timestamp taken on packet reception for RTT sampling.

When the no_feedback_timer() is called, then the timestamp argument is simply set to NULL - so that ccid3_hc_tx_update_x() takes care of the logic.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [DCCP] packet_history: Convert dccphtx_tstamp to ktime_t | Arnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [DCCP] packet_history: convert dccphrx_tstamp to ktime_t | Arnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [DCCP] CCID3: Stop using dccp_timestamp | Arnaldo Carvalho de Melo
Now to convert the ackvec code to ktime_t so that we can get rid of dccp_timestamp and the epoch thing in dccp_sock. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [DCCP]: Convert dccp_sample_rtt to ktime_t | Arnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 | [DCCP]: Convert ccid3hcrx_tstamp_last_feedback to ktime_t | Arnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>