aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2008-09-04dccp: Feature negotiation for minimum-checksum-coverageGerrit Renker
This provides feature negotiation for server minimum checksum coverage which so far has been missing. Since sender/receiver coverage values range only from 0...15, their type has also been reduced in size from u16 to u4. Feature-negotiation options are now generated for both sender and receiver coverage, i.e. when the peer has `forgotten' to enable partial coverage then feature negotiation will automatically enable (negotiate) the partial coverage value for this connection. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Deprecate old setsockopt frameworkGerrit Renker
The previous setsockopt interface, which passed socket options via struct dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to ugly code since the old approach did not distinguish between NN and SP values. This patch removes the old setsockopt interface and replaces it with two new functions to register NN/SP values for feature negotiation. These are essentially wrappers around the internal __feat_register functions, with checking added to avoid * wrong usage (type); * changing values while the connection is in progress. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp: Mechanism to resolve CCID dependenciesGerrit Renker
This adds a hook to resolve features whose value depends on the choice of CCID. It is done at the server since it can only be done after the CCID values have been negotiated; i.e. the client will add its CCID preference list on the Change options sent in the Request, which will be reconciled with the local preference list of the server. The concept is documented on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\ implementation_notes.html#ccid_dependencies Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Resolve dependencies of features on choice of CCIDGerrit Renker
This provides a missing link in the code chain, as several features implicitly depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector feature, but also Ack Ratio and Send Loss Event Rate (also taken care of). For Send Ack Vector, the situation is as follows: * since CCID2 mandates the use of Ack Vectors, there is no point in allowing endpoints which use CCID2 to disable Ack Vector features such a connection; * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4); * for all other CCIDs, the use of (Send) Ack Vector is optional and thus negotiable. However, this implies that the code negotiating the use of Ack Vectors also supports it (i.e. is able to supply and to either parse or ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack Vector support), the use of Ack Vectors is here disabled, with a comment in the source code. An analogous consideration arises for the Send Loss Event Rate feature, since the CCID-3 implementation does not support the loss interval options of RFC 4342. To make such use explicit, corresponding feature-negotiation options are inserted which signal the use of the loss event rate option, as it is used by the CCID3 code. Lastly, the values of the Ack Ratio feature are matched to the choice of CCID. The patch implements this as a function which is called after the user has made all other registrations for changing default values of features. The table is variable-length, the reserved (and hence for feature-negotiation invalid, confirmed by considering section 19.4 of RFC 4340) feature number `0' is used to mark the end of the table. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Query supported CCIDsGerrit Renker
This provides a data structure to record which CCIDs are locally supported and three accessor functions: - a test function for internal use which is used to validate CCID requests made by the user; - a copy function so that the list can be used for feature-negotiation; - documented getsockopt() support so that the user can query capabilities. The data structure is a table which is filled in at compile-time with the list of available CCIDs (which in turn depends on the Kconfig choices). Using the copy function for cloning the list of supported CCIDs is useful for feature negotiation, since the negotiation is now with the full list of available CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation will not fail if the peer requests to use CCID3 instead of CCID2. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Registration routines for changing feature valuesGerrit Renker
Two registration routines, for SP and NN features, are provided by this patch, replacing a previous routine which was used for both feature types. These are internal-only routines and therefore start with `__feat_register'. It further exports the known limits of Sequence Window and Ack Ratio as symbolic constants. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Limit feature negotiation to connection setup phaseGerrit Renker
This patch starts the new implementation of feature negotiation: 1. Although it is theoretically possible to perform feature negotiation at any time (and RFC 4340 supports this), in practice this is prohibitively complex, as it requires to put traffic on hold for each new negotiation. 2. As a byproduct of restricting feature negotiation to connection setup, the feature-negotiation retransmit timer is no longer required. This part is now mapped onto the protocol-level retransmission. Details indicating why timers are no longer needed can be found on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\ implementation_notes.html This patch disables anytime negotiation, subsequent patches work out full feature negotiation support for connection setup. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp: Cleanup routines for feature negotiationGerrit Renker
This inserts the required de-allocation routines for memory allocated by feature negotiation in the socket destructors, replacing dccp_feat_clean() in one instance. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Per-socket initialisation of feature negotiationGerrit Renker
This provides feature-negotiation initialisation for both DCCP sockets and DCCP request_sockets, to support feature negotiation during connection setup. It also resolves a FIXME regarding the congestion control initialisation. Thanks to Wei Yongjun for help with the IPv6 side of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: List management for new feature negotiationGerrit Renker
This adds list fields and list management functions for the new feature negotiation implementation. The new code is kept in parallel to the old code, until removed at the end of the patch set. Thanks to Arnaldo for suggestions to improve the code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Implement lookup table for feature-negotiation informationGerrit Renker
A lookup table for feature-negotiation information, extracted from RFC 4340/42, is provided by this patch. All currently known features can be found in this table, along with their feature location, their default value, and type. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp: Basic data structure for feature negotiationGerrit Renker
This patch prepares for the new and extended feature-negotiation routines. The following feature-negotiation data structures are provided: * a container for the various (SP or NN) values, * symbolic state names to track feature states, * an entry struct which holds all current information together, * elementary functions to fill in and process these structures. Entry structs are arranged as FIFO for the following reason: RFC 4340 specifies that if multiple options of the same type are present, they are processed in the order of their appearance in the packet; which means that this order needs to be preserved in the local data structure (the later insertion code also respects this order). The struct list_head has been chosen for the following reasons: the most frequent operations are * add new entry at tail (when receiving Change or setting socket options); * delete entry (when Confirm has been received); * deep copy of entire list (cloning from listening socket onto request socket). The NN value has been set to 64 bit, which is a currently sufficient upper limit (Sequence Window feature has 48 bit). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
2008-09-04dccp ccid-3: Replace lazy BUG_ON with conditionGerrit Renker
The BUG_ON(w_tot == 0) only holds if there is no more than 1 loss interval in the loss history. If there is only a single loss interval, the calc_i_mean() routine need in fact not be called (RFC 3448, 6.3.1). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp: Toggle debug output without module unloadingGerrit Renker
This sets the sysfs permissions so that root can toggle the `debug' parameter available for nearly every DCCP module. This is useful since there are various module inter-dependencies. The debug flag can now be toggled at runtime using echo 1 > /sys/module/dccp/parameters/dccp_debug echo 1 > /sys/module/dccp_ccid2/parameters/ccid2_debug echo 1 > /sys/module/dccp_ccid3/parameters/ccid3_debug echo 1 > /sys/module/dccp_tfrc_lib/parameters/tfrc_debug The last is not very useful yet, since no code at the moment calls the tfrc_debug() macro. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp: Empty the write queue when disconnectingGerrit Renker
dccp_disconnect() can be called due to several reasons: 1. when the connection setup failed (inet_stream_connect()); 2. when shutting down (inet_shutdown(), inet_csk_listen_stop()); 3. when aborting the connection (dccp_close() with 0 linger time). In case (1) the write queue is empty. This patch empties the write queue, if in case (2) or (3) it was not yet empty. This avoids triggering the write-queue BUG_TRAP in sk_stream_kill_queues() later on. It also seems natural to do: when breaking an association, to delete all packets that were originally intended for the soon-disconnected end (compare with call to tcp_write_queue_purge in tcp_disconnect()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp: Fill in the Data fields for "Option Error" ResetsGerrit Renker
This updates the use of the `out_invalid_option' label, which produces a Reset (code 5, "Option Error"), to fill in the Data1...Data3 fields as specified in RFC 4340, 5.6. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp: Silently ignore options with nonsensical lengthsGerrit Renker
This updates the option-parsing code with regard to RFC 4340, 5.8: "[..] options with nonsensical lengths (length byte less than two or more than the remaining space in the options portion of the header) MUST be ignored, and any option space following an option with nonsensical length MUST likewise be ignored." Hence in the following cases erratic options will be ignored: 1. The type byte of a multi-byte option is the last byte of the header options (i.e. effective option length of 1). 2. The value of the length byte is less than the minimum 2. This has been changed from previously 3: although no multi-byte option with a length less than 3 yet exists (cf. table 3 in 5.8), a length of 2 is valid. (The switch-statement in dccp_parse has further per-option length checks.) 3. The option length exceeds the length of the remaining option space. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04dccp: Always generate a Reset in response to option errorsWei Yongjun
RFC4340 states that if a packet is received with an option error (such as a Mandatory Option as the last byte of the option list), the endpoint should repond with a Reset. In the LISTEN and RESPOND states, the endpoint correctly reponds with Reset, while in the REQUEST/OPEN states, packets with option errors are just ignored. The packet sequence is as follows: Case 1: Endpoint A Endpoint B (CLOSED) (CLOSED) <---------------- REQUEST RESPONSE -----------------> (*1) (with invalid option) <---------------- RESET (with Reset Code 5, "Option Error") (*1) currently just ignored, no Reset is sent Case 2: Endpoint A Endpoint B (OPEN) (OPEN) DATA-ACK -----------------> (*2) (with invalid option) <---------------- RESET (with Reset Code 5, "Option Error") (*2) currently just ignored, no Reset is sent This patch fixes the problem, by generating a Reset instead of silently ignoring option errors. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-02tipc: Don't use structure names which easily globally conflict.David S. Miller
Andrew Morton reported a build failure on sparc32, because TIPC uses names like "struct node" and there is a like named data structure defined in linux/node.h This just regexp replaces "struct node*" to "struct tipc_node*" to avoid this and any future similar problems. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02ipsec: Fix deadlock in xfrm_state management.David S. Miller
Ever since commit 4c563f7669c10a12354b72b518c2287ffc6ebfb3 ("[XFRM]: Speed up xfrm_policy and xfrm_state walking") it is illegal to call __xfrm_state_destroy (and thus xfrm_state_put()) with xfrm_state_lock held. If we do, we'll deadlock since we have the lock already and __xfrm_state_destroy() tries to take it again. Fix this by pushing the xfrm_state_put() calls after the lock is dropped. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02ipv: Re-enable IP when MTU > 68Breno Leitao
Re-enable IP when the MTU gets back to a valid size. This patch just checks if the in_dev is NULL on a NETDEV_CHANGEMTU event and if MTU is valid (bigger than 68), then re-enable in_dev. Also a function that checks valid MTU size was created. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02net/xfrm: Use an IS_ERR test rather than a NULL testJulien Brunel
In case of error, the function xfrm_bundle_create returns an ERR pointer, but never returns a NULL pointer. So a NULL test that comes after an IS_ERR test should be deleted. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match_bad_null_test@ expression x, E; statement S1,S2; @@ x = xfrm_bundle_create(...) ... when != x = E * if (x != NULL) S1 else S2 // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02mac80211: Fix debugfs union misuse and pointer corruptionJouni Malinen
debugfs union in struct ieee80211_sub_if_data is misused by including a common default_key dentry as a union member. This ends occupying the same memory area with the first dentry in other union members (structures; usually drop_unencrypted). Consequently, debugfs operations on default_key symlinks and drop_unencrypted entry are using the same dentry pointer even though they are supposed to be separate ones. This can lead to removing entries incorrectly or potentially leaving something behind since one of the dentry pointers gets lost. Fix this by moving the default_key dentry to a new struct (common_debugfs) that contains dentries (more to be added in future) that are shared by all vif types. The debugfs union must only be used for vif type-specific entries to avoid this type of pointer corruption. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-02net/wireless/Kconfig: clarify the description for CONFIG_WIRELESS_EXT_SYSFSFlorian Mickler
Current setup with hal and NetworkManager will fail to work without newest hal version with this config option disabled. Although this will solve itself by time, at the moment it is dishonest to say that we don't know any software that uses it, if there are many many people relying on old hal versions. Signed-off-by: Florian Mickler <florian@mickler.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29pkt_sched: Fix locking of qdisc_root with qdisc_root_sleeping_lock()Jarek Poplawski
Use qdisc_root_sleeping_lock() instead of qdisc_root_lock() where appropriate. The only difference is while dev is deactivated, when currently we can use a sleeping qdisc with the lock of noop_qdisc. This shouldn't be dangerous since after deactivation root lock could be used only by gen_estimator code, but looks wrong anyway. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-29ipv6: When we droped a packet, we should return NET_RX_DROP instead of 0Yang Hongyang
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27sctp: fix random memory dereference with SCTP_HMAC_IDENT option.Vlad Yasevich
The number of identifiers needs to be checked against the option length. Also, the identifier index provided needs to be verified to make sure that it doesn't exceed the bounds of the array. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27sctp: correct bounds check in sctp_setsockopt_auth_keyVlad Yasevich
The bonds check to prevent buffer overlflow was not exactly right. It still allowed overflow of up to 8 bytes which is sizeof(struct sctp_authkey). Since optlen is already checked against the size of that struct, we are guaranteed not to cause interger overflow either. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27Merge branch 'no-iwlwifi' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2008-08-27ipv4: mode 0555 in ipv4_skeletonHugh Dickins
vpnc on today's kernel says Cannot open "/proc/sys/net/ipv4/route/flush": d--------- 0 root root 0 2008-08-26 11:32 /proc/sys/net/ipv4/route d--------- 0 root root 0 2008-08-26 19:16 /proc/sys/net/ipv4/neigh Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27tcp: fix tcp header size miscalculation when window scale is unusedPhilip Love
The size of the TCP header is miscalculated when the window scale ends up being 0. Additionally, this can be induced by sending a SYN to a passive open port with a window scale option with value 0. Signed-off-by: Philip Love <love_phil@emc.com> Signed-off-by: Adam Langley <agl@imperialviolet.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27pkt_sched: Fix gen_estimator locksJarek Poplawski
While passing a qdisc root lock to gen_new_estimator() and gen_replace_estimator() dev could be deactivated or even before grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create), so using qdisc_root_lock() is not enough. This patch adds qdisc_root_sleeping_lock() for this, plus additional checks, where necessary. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27pkt_sched: Use rcu_assign_pointer() to change dev_queue->qdiscJarek Poplawski
These pointers are RCU protected, so proper primitives should be used. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27pkt_sched: Fix dev_graft_qdisc() lockingJarek Poplawski
During dev_graft_qdisc() dev is deactivated, so qdisc_root_lock() returns wrong lock of noop_qdisc instead of qdisc_sleeping. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-26mac80211: quiet chatty IBSS merge messageJohn W. Linville
It seems obvious that this #ifndef should be the opposite polarity... Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-26mac80211: don't send empty extended rates IEJan-Espen Pettersen
The association request includes a list of supported data rates. 802.11b: 4 supported rates. 802.11g: 12 (8 + 4) supported rates. 802.11a: 8 supported rates. The rates tag of the assoc request has room for only 8 rates. In case of 802.11g an extended rate tag is appended. However in net/wireless/mlme.c an extended (empty) rate tag is also appended if the number of rates is exact 8. This empty (length=0) extended rates tag causes some APs to deny association with code 18 (unsupported rates). These APs include my ZyXEL G-570U, and according to Tomas Winkler som Cisco APs. 'If count == 8' has been used to check for the need for an extended rates tag. But count would also be equal to 8 if the for loop exited because of no more supported rates. Therefore a check for count being less than rates_len would seem more correct. Thanks to: * Dan Williams for newbie guidance * Tomas Winkler for confirming the problem Signed-off-by: Jan-Espen Pettersen <sigsegv@radiotube.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-26mac80211: Fix debugfs file add/del for netdevJouni Malinen
Previous version was using incorrect union structures for non-AP interfaces when adding and removing max_ratectrl_rateidx and force_unicast_rateidx entries. Depending on the vif type, this ended up in corrupting debugfs entries since the dentries inside different union structures ended up going being on top of eachother.. As the end result, debugfs files were being left behind with references to freed data (instant kernel oops on access) and directories were not removed properly when unloading mac80211 drivers. This patch fixes those issues by using only a single union structure based on the vif type. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-26net/mac80211/mesh.c: correct the argument to __mesh_table_freeJulia Lawall
In the function mesh_table_grow, it is the new table not the argument table that should be freed if the function fails (cf commit bd9b448f4c0a514559bdae4ca18ca3e8cd999c6d) The semantic match that detects this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r exists@ local idexpression x; expression E,f; position p1,p2,p3; identifier l; statement S; @@ x = mesh_table_alloc@p1(...) ... if (x == NULL) S ... when != E = x when != mesh_table_free(x) goto@p2 l; ... when != E = x when != f(...,x,...) when any ( return \(0\|x\); | return@p3 ...; ) @script:python@ p1 << r.p1; p2 << r.p2; p3 << r.p3; @@ print "%s: call on line %s not freed or saved before return on line %s via line %s" % (p1[0].file,p1[0].line,p3[0].line,p2[0].line) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-26mac80211: Use IWEVASSOCREQIE instead of IWEVCUSTOMJouni Malinen
The previous code was using IWEVCUSTOM to report IEs from AssocReq and AssocResp frames into user space. This can easily hit the 256 byte limit (IW_CUSTOM_MAX) with APs that include number of vendor IEs in AssocResp. This results in the event message not being sent and dmesg showing "wlan0 (WE) : Wireless Event too big (366)" type of errors. Convert mac80211 to use IWEVASSOCREQIE/IWEVASSOCRESPIE to avoid the issue of being unable to send association IEs as wireless events. These newer event types use binary encoding and larger maximum size (IW_GENERIC_IE_MAX = 1024), so the likelyhood of not being able to send the IEs is much smaller than with IWEVCUSTOM. As an extra benefit, the code is also quite a bit simpler since there is no need to allocate an extra buffer for hex encoding. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-26net: rfkill: add missing line breakFelipe Balbi
Trivial patch adding a missing line break on rfkill_claim_show(). Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Acked-by: Ivo van Doorn <IvDoorn@gmail.co> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-25ipv6: sysctl fixesAl Viro
Braino: net.ipv6 in ipv6 skeleton has no business in rotable class Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-25ipv4: sysctl fixesAl Viro
net.ipv4.neigh should be a part of skeleton to avoid ordering problems Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-25sctp: add verification checks to SCTP_AUTH_KEY optionVlad Yasevich
The structure used for SCTP_AUTH_KEY option contains a length that needs to be verfied to prevent buffer overflow conditions. Spoted by Eugene Teo <eteo@redhat.com>. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-23ipv6: protocol for address routesStephen Hemminger
This fixes a problem spotted with zebra, but not sure if it is necessary a kernel problem. With IPV6 when an address is added to an interface, Zebra creates a duplicate RIB entry, one as a connected route, and other as a kernel route. When an address is added to an interface the RTN_NEWADDR message causes Zebra to create a connected route. In IPV4 when an address is added to an interface a RTN_NEWROUTE message is set to user space with the protocol RTPROT_KERNEL. Zebra ignores these messages, because it already has the connected route. The problem is that route created in IPV6 has route protocol == RTPROT_BOOT. Was this a design decision or a bug? This fixes it. Same patch applies to both net-2.6 and stable. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-23icmp: icmp_sk() should not use smp_processor_id() in preemptible codeDenis V. Lunev
Pass namespace into icmp_xmit_lock, obtain socket inside and return it as a result for caller. Thanks Alexey Dobryan for this report: Steps to reproduce: CONFIG_PREEMPT=y CONFIG_DEBUG_PREEMPT=y tracepath <something> BUG: using smp_processor_id() in preemptible [00000000] code: tracepath/3205 caller is icmp_sk+0x15/0x30 Pid: 3205, comm: tracepath Not tainted 2.6.27-rc4 #1 Call Trace: [<ffffffff8031af14>] debug_smp_processor_id+0xe4/0xf0 [<ffffffff80409405>] icmp_sk+0x15/0x30 [<ffffffff8040a17b>] icmp_send+0x4b/0x3f0 [<ffffffff8025a415>] ? trace_hardirqs_on_caller+0xd5/0x160 [<ffffffff8025a4ad>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8023a475>] ? local_bh_enable_ip+0x95/0x110 [<ffffffff804285b9>] ? _spin_unlock_bh+0x39/0x40 [<ffffffff8025a26c>] ? mark_held_locks+0x4c/0x90 [<ffffffff8025a4ad>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8025a415>] ? trace_hardirqs_on_caller+0xd5/0x160 [<ffffffff803e91b4>] ip_fragment+0x8d4/0x900 [<ffffffff803e7030>] ? ip_finish_output2+0x0/0x290 [<ffffffff803e91e0>] ? ip_finish_output+0x0/0x60 [<ffffffff803e6650>] ? dst_output+0x0/0x10 [<ffffffff803e922c>] ip_finish_output+0x4c/0x60 [<ffffffff803e92e3>] ip_output+0xa3/0xf0 [<ffffffff803e68d0>] ip_local_out+0x20/0x30 [<ffffffff803e753f>] ip_push_pending_frames+0x27f/0x400 [<ffffffff80406313>] udp_push_pending_frames+0x233/0x3d0 [<ffffffff804067d1>] udp_sendmsg+0x321/0x6f0 [<ffffffff8040d155>] inet_sendmsg+0x45/0x80 [<ffffffff803b967f>] sock_sendmsg+0xdf/0x110 [<ffffffff8024a100>] ? autoremove_wake_function+0x0/0x40 [<ffffffff80257ce5>] ? validate_chain+0x415/0x1010 [<ffffffff8027dc10>] ? __do_fault+0x140/0x450 [<ffffffff802597d0>] ? __lock_acquire+0x260/0x590 [<ffffffff803b9e55>] ? sockfd_lookup_light+0x45/0x80 [<ffffffff803ba50a>] sys_sendto+0xea/0x120 [<ffffffff80428e42>] ? _spin_unlock_irqrestore+0x42/0x80 [<ffffffff803134bc>] ? __up_read+0x4c/0xb0 [<ffffffff8024e0c6>] ? up_read+0x26/0x30 [<ffffffff8020b8bb>] system_call_fastpath+0x16/0x1b icmp6_sk() is similar. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-22pkt_sched: Fix qdisc list lockingJarek Poplawski
Since some qdiscs call qdisc_tree_decrease_qlen() (so qdisc_lookup()) without rtnl_lock(), adding and deleting from a qdisc list needs additional locking. This patch adds global spinlock qdisc_list_lock and wrapper functions for modifying the list. It is considered as a temporary solution until hfsc_dequeue(), netem_dequeue() and tbf_dequeue() (or qdisc_tree_decrease_qlen()) are redone. With feedback from Herbert Xu and David S. Miller. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-21pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() raceJarek Poplawski
dev_deactivate() can skip rescheduling of a qdisc by qdisc_watchdog() or other timer calling netif_schedule() after dev_queue_deactivate(). We prevent this checking aliveness before scheduling the timer. Since during deactivation the root qdisc is available only as qdisc_sleeping additional accessor qdisc_root_sleeping() is created. With feedback from Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-21sctp: fix potential panics in the SCTP-AUTH API.Vlad Yasevich
All of the SCTP-AUTH socket options could cause a panic if the extension is disabled and the API is envoked. Additionally, there were some additional assumptions that certain pointers would always be valid which may not always be the case. This patch hardens the API and address all of the crash scenarios. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-19pkt_sched: Prevent livelock in TX queue running.David S. Miller
If dev_deactivate() is trying to quiesce the queue, it is theoretically possible for another cpu to livelock trying to process that queue. This happens because dev_deactivate() grabs the queue spinlock as it checks the queue state, whereas net_tx_action() does a trylock and reschedules the qdisc if it hits the lock. This breaks the livelock by adding a check on __QDISC_STATE_DEACTIVATED to net_tx_action() when the trylock fails. Based upon feedback from Herbert Xu and Jarek Poplawski. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-19Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6