aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2009-11-18linkwatch: linkwatch_forget_dev() to speedup device dismantleEric Dumazet
Herbert Xu a écrit : > On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote: >> Really, the link watch stuff is just due for a redesign. I don't >> think a simple hack is going to cut it this time, sorry Eric :-) > > I have no objections against any redesigns, but since the only > caller of linkwatch_forget_dev runs in process context with the > RTNL, it could also legally emit those events. Thanks guys, here an updated version then, before linkwatch surgery ? In this version, I force the event to be sent synchronously. [PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.266s user 0m0.000s sys 0m0.001s real 0m0.770s user 0m0.000s sys 0m0.000s real 0m1.022s user 0m0.000s sys 0m0.000s One problem of current schem in vlan dismantle phase is the holding of device done by following chain : vlan_dev_stop() -> netif_carrier_off(dev) -> linkwatch_fire_event(dev) -> dev_hold() ... And __linkwatch_run_queue() runs up to one second later... A generic fix to this problem is to add a linkwatch_forget_dev() method to unlink the device from the list of watched devices. dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory), to be able to unlink device in O(1). After patch : time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.024s user 0m0.000s sys 0m0.000s real 0m0.032s user 0m0.000s sys 0m0.001s real 0m0.033s user 0m0.000s sys 0m0.000s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18net: introduce NETDEV_UNREGISTER_PERNETOctavian Purdila
This new event is called once for each unique net namespace in batched unregister operations (with the argument set to a random device from that namespace) and once per device in non-batched unregister operations. It allows us to factorize some device unregister work such as clearing the routing cache. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17net: add dev_txq_stats_fold() helperEric Dumazet
Some drivers ndo_get_stats() method need to perform txqueue stats folding. Move folding from dev_get_stats() to a new dev_txq_stats_fold() function Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/can/Kconfig
2009-11-15Revert "isdn: isdn_ppp: Use SKB list facilities instead of home-grown ↵David S. Miller
implementation." This reverts commit 38783e671399b5405f1fd177d602c400a9577ae6. It causes kernel bugzilla #14594 Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15remove deprecated and not used: print_mac()Marin Mitov
The function print_mac in net/ethernet/eth.c is marked __deprecated and not used. Remove it. Signed-off-by: Marin Mitov <mitov@issp.bas.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15net: Optimize hard_start_xmit() return checkingJarek Poplawski
Recent changes in the TX error propagation require additional checking and masking of values returned from hard_start_xmit(), mainly to separate cases where skb was consumed. This aim can be simplified by changing the order of NETDEV_TX and NET_XMIT codes, because the latter are treated similarly to negative (ERRNO) values. After this change much simpler dev_xmit_complete() is also used in sch_direct_xmit(), so it is moved to netdevice.h. Additionally NET_RX definitions in netdevice.h are moved up from between TX codes to avoid confusion while reading the TX comment. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13ipv6: use RCU to walk list of network devicesEric Dumazet
No longer need read_lock(&dev_base_lock), use RCU instead. We also can avoid taking references on inet6_dev structs. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13net: TCP_MSS_DEFAULT, TCP_MSS_DESIREDWilliam Allen Simpson
Define two symbols needed in both kernel and user space. Remove old (somewhat incorrect) kernel variant that wasn't used in most cases. Default should apply to both RMSS and SMSS (RFC2581). Replace numeric constants with defined symbols. Stand-alone patch, originally developed for TCPCT. Signed-off-by: William.Allen.Simpson@gmail.com Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13net: allow to propagate errors through ->ndo_hard_start_xmit()Patrick McHardy
Currently the ->ndo_hard_start_xmit() callbacks are only permitted to return one of the NETDEV_TX codes. This prevents any kind of error propagation for virtual devices, like queue congestion of the underlying device in case of layered devices, or unreachability in case of tunnels. This patches changes the NET_XMIT codes to avoid clashes with the NETDEV_TX codes and changes the two callers of dev_hard_start_xmit() to expect either errno codes, NET_XMIT codes or NETDEV_TX codes as return value. In case of qdisc_restart(), all non NETDEV_TX codes are mapped to NETDEV_TX_OK since no error propagation is possible when using qdiscs. In case of dev_queue_xmit(), the error is propagated upwards. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10netdev: add netdev_continue_rcustephen hemminger
This adds an RCU macro for continuing search, useful for some network devices like vlan. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10usbnet: Set link down initially for drivers that update link stateBen Hutchings
Some usbnet drivers update link state while others do not due to hardware limitations. Add a flag to distinguish those that do, and set the link down initially for their devices. This is intended to fix this bug: http://bugs.debian.org/444043 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10udp: bind() optimisationEric Dumazet
UDP bind() can be O(N^2) in some pathological cases. Thanks to secondary hash tables, we can make it O(N) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-09Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2009-11-08net/compat_ioctl: support SIOCWANDEVArnd Bergmann
This adds compat_ioctl support for SIOCWANDEV, which has always been missing. The definition of struct compat_ifreq was missing an ifru_settings fields that is needed to support SIOCWANDEV, so add that and clean up the whitespace damage in the struct definition. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08udp: secondary hash on (local port, local address)Eric Dumazet
Extends udp_table to contain a secondary hash table. socket anchor for this second hash is free, because UDP doesnt use skc_bind_node : We define an union to hold both skc_bind_node & a new hlist_nulls_node udp_portaddr_node udp_lib_get_port() inserts sockets into second hash chain (additional cost of one atomic op) udp_lib_unhash() deletes socket from second hash chain (additional cost of one atomic op) Note : No spinlock lockdep annotation is needed, because lock for the secondary hash chain is always get after lock for primary hash chain. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08udp: split sk_hash into two u16 hashesEric Dumazet
Union sk_hash with two u16 hashes for udp (no extra memory taken) One 16 bits hash on (local port) value (the previous udp 'hash') One 16 bits hash on (local address, local port) values, initialized but not yet used. This second hash is using jenkin hash for better distribution. Because the 'port' is xored later, a partial hash is performed on local address + net_hash_mix(net) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-08can: Driver for the Microchip MCP251x SPI CAN controllersChristian Pellegrin
Signed-off-by: Christian Pellegrin <chripell@fsfe.org> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-07net: kill proto_ops wrapperArnd Bergmann
All users of wrapped proto_ops are now gone, so we can safely remove the wrappers as well. Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06net: compat: No need to define IFHWADDRLEN and IFNAMSIZ twice.David S. Miller
It's defined colloqually in linux/if.h and linux/compat.h includes that. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06compat: add struct compat_ifreq etc to compat.hArnd Bergmann
In order to move socket ioctl conversion code into multiple places in the socket code, we need a common defintion of the data structures it uses. Also change the name from ifreq32 to compat_ifreq to follow the naming convention for compat.h Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-06Merge branch 'for-next' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/lowpan/lowpan
2009-11-06Merge branch 'linux-2.6.33.y' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax
2009-11-06ieee802154: add support for creation/removal of logic interfacesDmitry Eremin-Solenikov
Add support for two more NL802154 commands: ADD_IFACE and DEL_IFACE, thus allowing creation and removal of logic WPAN interfaces on the top of wpan-phy. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06ieee802154: add LIST_PHY command supportDmitry Eremin-Solenikov
Add nl802154 command to get information about PHY's present in the system. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2009-11-06Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/usb/cdc_ether.c All CDC ethernet devices of type USB_CLASS_COMM need to use '&mbm_info'. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05net: pass kern to net_proto_family create functionEric Paris
The generic __sock_create function has a kern argument which allows the security system to make decisions based on if a socket is being created by the kernel or by userspace. This patch passes that flag to the net_proto_family specific create function, so it can do the same thing. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-05net: drop capability from protocol definitionsEric Paris
struct can_proto had a capability field which wasn't ever used. It is dropped entirely. struct inet_protosw had a capability field which can be more clearly expressed in the code by just checking if sock->type = SOCK_RAW. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04net: cleanup include/linuxEric Dumazet
This cleanup patch puts struct/union/enum opening braces, in first line to ease grep games. struct something { becomes : struct something { Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04net: Introduce for_each_netdev_rcu() iteratorEric Dumazet
Adds RCU management to the list of netdevices. Convert some for_each_netdev() users to RCU version, if it can avoid read_lock-ing dev_base_lock Ie: read_lock(&dev_base_loack); for_each_netdev(net, dev) some_action(); read_unlock(&dev_base_lock); becomes : rcu_read_lock(); for_each_netdev_rcu(net, dev) some_action(); rcu_read_unlock(); Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02gianfar: Basic Support for programming hash rulesSandeep Gopalpet
This patch provides basic hash rules programming via the ethtool interface. Signed-off-by: Sandeep Gopalpet <Sandeep.Kumar@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02tg3 / broadcom: Optionally disable TXC if no linkMatt Carlson
This patch adds code to disable the TXC and RXC reference clocks if link is not available. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02tg3 / broadcom: Add code to disable rxc refclkMatt Carlson
The 5785 does not use the RXC reference clock. Turning it off is desirable as it saves power. By default, the 50610 enables the RXC reference clock and the 50610M disables it. Presumably this is one of the reasons why the hardware architect chose one over the other. Adding a "rx reference clock disable" flag is not the ideal way to describe the option, as it would force the MAC using a 50610M to set the flag. Ideally we want the flags to represent opt-in behavior that deviates from hardware defaults. Furthermore, the lack of a "disable" flag implies that the requester wants the rx reference clock enabled, which doesn't necessarily follow. By presenting the option as a passive statement (rx reference clock unused) rather than a command, I hope to convey an opt-in option to disable the rx reference clock that falls back to hardware defaults if not set. A secondary benefit of this is that it keeps the intelligence about phy defaults in the broadcom module where it belongs and allows the broadcom module more latitude should a bug arise. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02tg3 / broadcom: Add PHY_BRCM_CLEAR_RGMII_MODE flagMatt Carlson
Broadcom 50610M parts changed the default definitions of the RGMII mode shadow register. The 5785 needs the RGMII mode selection bits [4:3] cleared. The default value of the remaining bits in this register are zero. Rather than unnecessarily burn an extra bit in the dev_flags member in an attempt to enumerate all possible combinations, this patch take a more course grained approach and labels the option as "clear all bits". Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-02broadcom: Consolidate dev_flags definitionsMatt Carlson
This patch moves all the dev_flags enumerations outside the broadcom.c file to include/linux/brcmphy.h. The existing flags were not used yet and have been re-enumerated to avoid conflicts. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-01net: Introduce dev_get_by_name_rcu()Eric Dumazet
Some workloads hit dev_base_lock rwlock pretty hard. We can use RCU lookups to avoid touching this rwlock (and avoid touching netdevice refcount) netdevices are already freed after a RCU grace period, so this patch adds no penalty at device dismantle time. However, it adds a synchronize_rcu() call in dev_change_name() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30RDS: Add GET_MR_FOR_DEST sockoptAndy Grover
RDS currently supports a GET_MR sockopt to establish a memory region (MR) for a chunk of memory. However, the fastreg method ties a MR to a particular destination. The GET_MR_FOR_DEST sockopt allows the remote machine to be specified, and thus support for fastreg (aka FRWRs). Note that this patch does *not* do all of this - it simply implements the new sockopt in terms of the old one, so applications can begin to use the new sockopt in preparation for cutover to FRWRs. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30mac80211: also drop qos-nullfunc frames silentlyJohannes Berg
We drop nullfunc frames, but not qos-nullfunc frames, even though those could be used for PS state control as well. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-10-30net: Allow devices to specify a device specific sysfs group.Eric W. Biederman
This isn't beautifully abstracted, but it is simple, simplifies uses and so far is only needed for the bonding driver. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30net: fix sk_forward_alloc corruptionEric Dumazet
On UDP sockets, we must call skb_free_datagram() with socket locked, or risk sk_forward_alloc corruption. This requirement is not respected in SUNRPC. Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC Reported-by: Francis Moreau <francis.moro@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29net: fix kmemcheck annotationsEric Dumazet
struct sk_buff kmemcheck annotations enlarged this structure by 8/16 bytes Fix this by moving 'protocol' inside flags1 bitfield, and queue_mapping inside flags2 bitfield. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29gro: Change all receive functions to return GRO result codesBen Hutchings
This will allow drivers to adjust their receive path dynamically based on whether GRO is being applied successfully. Currently all in-tree callers ignore the return values of these functions and do not need to be changed. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29gro: Name the GRO result enumeration typeBen Hutchings
This clarifies which return and parameter types are GRO result codes and not RX result codes. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29net,socket: introduce DECLARE_SOCKADDR helper to catch overflow at build timeCyrill Gorcunov
proto_ops->getname implies copying protocol specific data into storage unit (particulary to __kernel_sockaddr_storage). So when we implement new protocol support we should keep such a detail in mind (which is easy to forget about). Lets introduce DECLARE_SOCKADDR helper which check if storage unit is not overfowed at build time. Eventually inet_getname is switched to use DECLARE_SOCKADDR (to show example of usage). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2009-10-29net: Introduce dev_get_by_index_rcu()Eric Dumazet
Some workloads hit dev_base_lock rwlock pretty hard. We can use RCU lookups to avoid touching this rwlock. netdevices are already freed after a RCU grace period, so this patch adds no penalty at device dismantle time. dev_ifname() converted to dev_get_by_index_rcu() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29Allow disabling of DSACK TCP option per routeGilad Ben-Yossef
Add and use no DSCAK bit in the features field. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29Allow to turn off TCP window scale opt per routeGilad Ben-Yossef
Add and use no window scale bit in the features field. Note that this is not the same as setting a window scale of 0 as would happen with window limit on route. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29Allow disabling TCP timestamp options per routeGilad Ben-Yossef
Implement querying and acting upon the no timestamp bit in the feature field. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29Add the no SACK route option featureGilad Ben-Yossef
Implement querying and acting upon the no sack bit in the features field. Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: Ori Finkelman <ori@comsleep.com> Sigend-off-by: Yony Amit <yony@comsleep.com> Signed-off-by: David S. Miller <davem@davemloft.net>