aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2009-03-13ppp: ppp_mp_explode() redesignGabriele Paoloni
I found the PPP subsystem to not work properly when connecting channels with different speeds to the same bundle. Problem Description: As the "ppp_mp_explode" function fragments the sk_buff buffer evenly among the PPP channels that are connected to a certain PPP unit to make up a bundle, if we are transmitting using an upper layer protocol that requires an Ack before sending the next packet (like TCP/IP for example), we will have a bandwidth bottleneck on the slowest channel of the bundle. Let's clarify by an example. Let's consider a scenario where we have two PPP links making up a bundle: a slow link (10KB/sec) and a fast link (1000KB/sec) working at the best (full bandwidth). On the top we have a TCP/IP stack sending a 1000 Bytes sk_buff buffer down to the PPP subsystem. The "ppp_mp_explode" function will divide the buffer in two fragments of 500B each (we are neglecting all the headers, crc, flags etc?.). Before the TCP/IP stack sends out the next buffer, it will have to wait for the ACK response from the remote peer, so it will have to wait for both fragments to have been sent over the two PPP links, received by the remote peer and reconstructed. The resulting behaviour is that, rather than having a bundle working @1010KB/sec (the sum of the channels bandwidths), we'll have a bundle working @20KB/sec (the double of the slowest channels bandwidth). Problem Solution: The problem has been solved by redesigning the "ppp_mp_explode" function in such a way to make it split the sk_buff buffer according to the speeds of the underlying PPP channels (the speeds of the serial interfaces respectively attached to the PPP channels). Referring to the above example, the redesigned "ppp_mp_explode" function will now divide the 1000 Bytes buffer into two fragments whose sizes are set according to the speeds of the channels where they are going to be sent on (e.g . 10 Byets on 10KB/sec channel and 990 Bytes on 1000KB/sec channel). The reworked function grants the same performances of the original one in optimal working conditions (i.e. a bundle made up of PPP links all working at the same speed), while greatly improving performances on the bundles made up of channels working at different speeds. Signed-off-by: Gabriele Paoloni <gabriele.paoloni@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13phylib: convert state_queue work to delayed_workMarcin Slusarz
It closes a race in phy_stop_machine when reprogramming of phy_timer (from phy_state_machine) happens between del_timer_sync and cancel_work_sync. Without this change it could lead to crash if phy_device would be freed after phy_stop_machine (timer would fire and schedule freed work). Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13Network Drop Monitor: Adding Build changes to enable drop monitorNeil Horman
Network Drop Monitor: Adding Build changes to enable drop monitor Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/Kbuild | 1 + net/Kconfig | 11 +++++++++++ net/core/Makefile | 1 + 3 files changed, 13 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13Network Drop Monitor: Adding drop monitor implementation & Netlink protocolNeil Horman
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/net_dropmon.h | 56 +++++++++ net/core/drop_monitor.c | 263 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 319 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying ↵Neil Horman
end-of-line points for skbs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/skbuff.h | 4 +++- net/core/datagram.c | 2 +- net/core/skbuff.c | 22 ++++++++++++++++++++++ net/ipv4/arp.c | 2 +- net/ipv4/udp.c | 2 +- net/packet/af_packet.c | 2 +- 6 files changed, 29 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-05ssb: Add SPROM fallback supportMichael Buesch
This adds SSB functionality to register a fallback SPROM image from the architecture setup code. Weird architectures exist that have half-assed SSB devices without SPROM attached to their PCI busses. The architecture can register a fallback SPROM image that is used if no SPROM is found on the SSB device. Signed-off-by: Michael Buesch <mb@bu3sch.de> Cc: Florian Fainelli <florian@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-03-05Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/tokenring/tmspci.c drivers/net/ucc_geth_mii.c
2009-03-04Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-03-04vlan: Fix vlan-in-vlan crashes.David S. Miller
As analyzed by Patrick McHardy, vlan needs to reset it's netdev_ops pointer in it's ->init() function but this leaves the compat method pointers stale. Add a netdev_resync_ops() and call it from the vlan code. Any other driver which changes ->netdev_ops after register_netdevice() will need to call this new function after doing so too. With help from Patrick McHardy. Tested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04neigh: Allow for user space users of the neighbour tableEric Biederman
Currently it is possible to do just about everything with the arp table from user space except treat an entry like you are using it. To that end implement and a flag NTF_USE that when set in a netwlink update request treats the neighbour table entry like the kernel does on the output path. This allows user space applications to share the kernel's arp cache. Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-03Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: don't allow setuid to succeed if the user does not have rt bandwidth sched_rt: don't start timer when rt bandwidth disabled
2009-03-03Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Teach RCU that idle task is not quiscent state at boot
2009-03-02Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: fix warning in io_mapping_map_wc() x86: i915 needs pgprot_writecombine() and is_io_mapping_possible()
2009-03-02skbuff.h: fix timestamps kernel-docRandy Dunlap
Fix skbuff.h kernel-doc for timestamps: must include "struct" keyword, otherwise there are kernel-doc errors: Error(linux-next-20090227//include/linux/skbuff.h:161): cannot understand prototype: 'struct skb_shared_hwtstamps ' Error(linux-next-20090227//include/linux/skbuff.h:177): cannot understand prototype: 'union skb_shared_tx ' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02wimax/i2400m: implement RX reorder supportInaky Perez-Gonzalez
Allow the device to give the driver RX data with reorder information. When that is done, the device will indicate the driver if a packet has to be held in a (sorted) queue. It will also tell the driver when held packets have to be released to the OS. This is done to improve the WiMAX-protocol level retransmission support when missing frames are detected. The code docs provide details about the implementation. In general, this just hooks into the RX path in rx.c; if a packet with the reorder bit in the RX header is detected, the reorder information in the header is extracted and one of the four main reorder operations are executed. In one case (queue) no packet will be delivered to the networking stack, just queued, whereas in the others (reset, update_ws and queue_update_ws), queued packet might be delivered depending on the window start for the specific queue. The modifications to files other than rx.c are: - control.c: during device initialization, enable reordering support if the rx_reorder_disabled module parameter is not enabled - driver.c: expose a rx_reorder_disable module parameter and call i2400m_rx_setup/release() to initialize/shutdown RX reorder support. - i2400m.h: introduce members in 'struct i2400m' needed for implementing reorder support. - linux/i2400m.h: introduce TLVs, commands and constant definitions related to RX reorder Last but not least, the rx reorder code includes an small circular log where the last N reorder operations are recorded to be displayed in case of inconsistency. Otherwise diagnosing issues would be almost impossible. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02wimax/i2400m: support extended data RX protocol (no need to reallocate skbs)Inaky Perez-Gonzalez
Newer i2400m firmwares (>= v1.4) extend the data RX protocol so that each packet has a 16 byte header. This header is mainly used to implement host reordeing (which is addressed in later commits). However, this header also allows us to overwrite it (once data has been extracted) with an Ethernet header and deliver to the networking stack without having to reallocate the skb (as it happened in fw <= v1.3) to make room for it. - control.c: indicate the device [dev_initialize()] that the driver wants to use the extended data RX protocol. Also involves adding the definition of the needed data types in include/linux/wimax/i2400m.h. - rx.c: handle the new payload type for the extended RX data protocol. Prepares the skb for delivery to netdev.c:i2400m_net_erx(). - netdev.c: Introduce i2400m_net_erx() that adds the fake ethernet address to a prepared skb and delivers it to the networking stack. - cleanup: in most instances in rx.c, the variable 'single' was renamed to 'single_last' for it better conveys its meaning. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02wimax: struct device - replace bus_id with dev_name(), dev_set_name()Kay Sievers
Cc: inaky.perez-gonzalez@intel.com Cc: linux-wimax@intel.com Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02wimax/i2400m: allow control of the base-station idle mode timeoutInaky Perez-Gonzalez
For power saving reasons, WiMAX links can be put in idle mode while connected after a certain time of the link not being used for tx or rx. In this mode, the device pages the base-station regularly and when data is ready to be transmitted, the link is revived. This patch allows the user to control the time the device has to be idle before it decides to go to idle mode from a sysfs interace. It also updates the initialization code to acknowledge the module variable 'idle_mode_disabled' when the firmware is a newer version (upcoming 1.4 vs 2.6.29's v1.3). The method for setting the idle mode timeout in the older firmwares is much more limited and can be only done at initialization time. Thus, the sysfs file will return -ENOSYS on older ones. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02tcp: kill eff_sacks "cache", the sole user can calculate itselfIlpo Järvinen
Also fixes insignificant bug that would cause sending of stale SACK block (would occur in some corner cases). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02fix warning in io_mapping_map_wc()Pallipadi, Venkatesh
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-01Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-tx.c net/8021q/vlan_core.c net/core/dev.c
2009-03-01net headers: export dcbnl.hChris Leech
The DCB netlink interface is required for building the userspace tools available at e1000.sourceforge.net Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-01net headers: cleanup dcbnl.hChris Leech
1) add an include for <linux/types.h> 2) change dcbmsg.dcb_family from unsigned char to __u8 to be more consistent with use of kernel types Signed-off-by: Chris Leech <christopher.leech@intel.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-28Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2009-02-28Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-02-27Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: enable DMAR by default xen: disable interrupts early, as start_kernel expects gpu/drm, x86, PAT: io_mapping_create_wc and resource_size_t gpu/drm, x86, PAT: Handle io_mapping_create_wc() errors in a clean way x86, Voyager: fix compile by lifting the degeneracy of phys_cpu_present_map x86, doc: fix references to Documentation/x86/i386/boot.txt
2009-02-27Fix recursive lock in free_uid()/free_user_ns()David Howells
free_uid() and free_user_ns() are corecursive when CONFIG_USER_SCHED=n, but free_user_ns() is called from free_uid() by way of uid_hash_remove(), which requires uidhash_lock to be held. free_user_ns() then calls free_uid() to complete the destruction. Fix this by deferring the destruction of the user_namespace. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-27nl80211: Provide access to STA TX/RX packet countersJouni Malinen
The TX/RX packet counters are needed to fill in RADIUS Accounting attributes Acct-Output-Packets and Acct-Input-Packets. We already collect the needed information, but only the TX/RX bytes were previously exposed through nl80211. Allow applications to fetch the packet counters, too, to provide more complete support for accounting. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-02-27sched: don't allow setuid to succeed if the user does not have rt bandwidthDhaval Giani
Impact: fix hung task with certain (non-default) rt-limit settings Corey Hickey reported that on using setuid to change the uid of a rt process, the process would be unkillable and not be running. This is because there was no rt runtime for that user group. Add in a check to see if a user can attach an rt task to its task group. On failure, return EINVAL, which is also returned in CONFIG_CGROUP_SCHED. Reported-by: Corey Hickey <bugfood-ml@fatooh.org> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26RDS: Add userspace headerAndy Grover
Applications include this header in order to use RDS sockets. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26RDS: Add AF and PF #defines for RDS socketsAndy Grover
RDS is a reliable datagram protocol used for IPC on Oracle database clusters. This adds address and protocol family numbers for it. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26block: reduce stack footprint of blk_recount_segments()Jens Axboe
blk_recalc_rq_segments() requires a request structure passed in, which we don't have from blk_recount_segments(). So the latter allocates one on the stack, using > 400 bytes of stack for that. This can cause us to spill over one page of stack from ext4 at least: 0) 4560 400 blk_recount_segments+0x43/0x62 1) 4160 32 bio_phys_segments+0x1c/0x24 2) 4128 32 blk_rq_bio_prep+0x2a/0xf9 3) 4096 32 init_request_from_bio+0xf9/0xfe 4) 4064 112 __make_request+0x33c/0x3f6 5) 3952 144 generic_make_request+0x2d1/0x321 6) 3808 64 submit_bio+0xb9/0xc3 7) 3744 48 submit_bh+0xea/0x10e 8) 3696 368 ext4_mb_init_cache+0x257/0xa6a [ext4] 9) 3328 288 ext4_mb_regular_allocator+0x421/0xcd9 [ext4] 10) 3040 160 ext4_mb_new_blocks+0x211/0x4b4 [ext4] 11) 2880 336 ext4_ext_get_blocks+0xb61/0xd45 [ext4] 12) 2544 96 ext4_get_blocks_wrap+0xf2/0x200 [ext4] 13) 2448 80 ext4_da_get_block_write+0x6e/0x16b [ext4] 14) 2368 352 mpage_da_map_blocks+0x7e/0x4b3 [ext4] 15) 2016 352 ext4_da_writepages+0x2ce/0x43c [ext4] 16) 1664 32 do_writepages+0x2d/0x3c 17) 1632 144 __writeback_single_inode+0x162/0x2cd 18) 1488 96 generic_sync_sb_inodes+0x1e3/0x32b 19) 1392 16 sync_sb_inodes+0xe/0x10 20) 1376 48 writeback_inodes+0x69/0xb3 21) 1328 208 balance_dirty_pages_ratelimited_nr+0x187/0x2f9 22) 1120 224 generic_file_buffered_write+0x1d4/0x2c4 23) 896 176 __generic_file_aio_write_nolock+0x35f/0x393 24) 720 80 generic_file_aio_write+0x6c/0xc8 25) 640 80 ext4_file_write+0xa9/0x137 [ext4] 26) 560 320 do_sync_write+0xf0/0x137 27) 240 48 vfs_write+0xb3/0x13c 28) 192 64 sys_write+0x4c/0x74 29) 128 128 system_call_fastpath+0x16/0x1b Split the segment counting out into a __blk_recalc_rq_segments() helper to avoid allocating an onstack request just for checking the physical segment count. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26rcu: Teach RCU that idle task is not quiscent state at bootPaul E. McKenney
This patch fixes a bug located by Vegard Nossum with the aid of kmemcheck, updated based on review comments from Nick Piggin, Ingo Molnar, and Andrew Morton. And cleans up the variable-name and function-name language. ;-) The boot CPU runs in the context of its idle thread during boot-up. During this time, idle_cpu(0) will always return nonzero, which will fool Classic and Hierarchical RCU into deciding that a large chunk of the boot-up sequence is a big long quiescent state. This in turn causes RCU to prematurely end grace periods during this time. This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks() function to ignore the idle task as a quiescent state until the system has started up the scheduler in rest_init(), introducing a new non-API function rcu_idle_now_means_idle() to inform RCU of this transition. RCU maintains an internal rcu_idle_cpu_truthful variable to track this state, which is then used by rcu_check_callback() to determine if it should believe idle_cpu(). Because this patch has the effect of disallowing RCU grace periods during long stretches of the boot-up sequence, this patch also introduces Josh Triplett's UP-only optimization that makes synchronize_rcu() be a no-op if num_online_cpus() returns 1. This allows boot-time code that calls synchronize_rcu() to proceed normally. Note, however, that RCU callbacks registered by call_rcu() will likely queue up until later in the boot sequence. Although rcuclassic and rcutree can also use this same optimization after boot completes, rcupreempt must restrict its use of this optimization to the portion of the boot sequence before the scheduler starts up, given that an rcupreempt RCU read-side critical section may be preeempted. In addition, this patch takes Nick Piggin's suggestion to make the system_state global variable be __read_mostly. Changes since v4: o Changes the name of the introduced function and variable to be less emotional. ;-) Changes since v3: o WARN_ON(nr_context_switches() > 0) to verify that RCU switches out of boot-time mode before the first context switch, as suggested by Nick Piggin. Changes since v2: o Created rcu_blocking_is_gp() internal-to-RCU API that determines whether a call to synchronize_rcu() is itself a grace period. o The definition of rcu_blocking_is_gp() for rcuclassic and rcutree checks to see if but a single CPU is online. o The definition of rcu_blocking_is_gp() for rcupreempt checks to see both if but a single CPU is online and if the system is still in early boot. This allows rcupreempt to again work correctly if running on a single CPU after booting is complete. o Added check to rcupreempt's synchronize_sched() for there being but one online CPU. Tested all three variants both SMP and !SMP, booted fine, passed a short rcutorture test on both x86 and Power. Located-by: Vegard Nossum <vegard.nossum@gmail.com> Tested-by: Vegard Nossum <vegard.nossum@gmail.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25ide: fix refcounting in device driversBartlomiej Zolnierkiewicz
During host driver module removal del_gendisk() results in a final put on drive->gendev and freeing the drive by drive_release_dev(). Convert device drivers from using struct kref to use struct device so device driver's object holds reference on ->gendev and prevents drive from prematurely going away. Also fix ->remove methods to not erroneously drop reference on a host driver by using only put_device() instead of ide*_put(). Reported-by: Stanislaw Gruszka <stf_xl@wp.pl> Tested-by: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-02-25Merge git://git.infradead.org/iommu-2.6Linus Torvalds
* git://git.infradead.org/iommu-2.6: intel-iommu: fix endless "Unknown DMAR structure type" loop VT-d: handle Invalidation Queue Error to avoid system hang intel-iommu: fix build error with INTR_REMAP=y and DMAR=n
2009-02-25gpu/drm, x86, PAT: io_mapping_create_wc and resource_size_tVenkatesh Pallipadi
io_mapping_create_wc should take a resource_size_t parameter in place of unsigned long. With unsigned long, there will be no way to map greater than 4GB address in i386/32 bit. On x86, greater than 4GB addresses cannot be mapped on i386 without PAE. Return error for such a case. Patch also adds a structure for io_mapping, that saves the base, size and type on HAVE_ATOMIC_IOMAP archs, that can be used to verify the offset on io_mapping_map calls. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/orinoco/orinoco.c
2009-02-24netlink: change nlmsg_notify() return value logicPablo Neira Ayuso
This patch changes the return value of nlmsg_notify() as follows: If NETLINK_BROADCAST_ERROR is set by any of the listeners and an error in the delivery happened, return the broadcast error; else if there are no listeners apart from the socket that requested a change with the echo flag, return the result of the unicast notification. Thus, with this patch, the unicast notification is handled in the same way of a broadcast listener that has set the NETLINK_BROADCAST_ERROR socket flag. This patch is useful in case that the caller of nlmsg_notify() wants to know the result of the delivery of a netlink notification (including the broadcast delivery) and take any action in case that the delivery failed. For example, ctnetlink can drop packets if the event delivery failed to provide reliable logging and state-synchronization at the cost of dropping packets. This patch also modifies the rtnetlink code to ignore the return value of rtnl_notify() in all callers. The function rtnl_notify() (before this patch) returned the error of the unicast notification which makes rtnl_set_sk_err() reports errors to all listeners. This is not of any help since the origin of the change (the socket that requested the echoing) notices the ENOBUFS error if the notification fails and should resync itself. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-24Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2009-02-24i2c-dev: Clarify the unit of ioctl I2C_TIMEOUTJean Delvare
The unit in which user-space can set the bus timeout value is jiffies for historical reasons (back when HZ was always 100.) This is however not good because user-space doesn't know how long a jiffy lasts. The timeout value should instead be set in a fixed time unit. Given the original value of HZ, this unit should be 10 ms, for compatibility. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Wolfram Sang <w.sang@pengutronix.de>
2009-02-24Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netns: fix double free at netns creation veth : add the set_mac_address capability sunlance: Beyond ARRAY_SIZE of ib->btx_ring sungem: another error printed one too early ISDN: fix sc/shmem printk format warning SMSC: timeout reaches -1 smsc9420: handle magic field of ethtool_eeprom sundance: missing parentheses? smsc9420: fix another postfixed timeout wimax/i2400m: driver loads firmware v1.4 instead of v1.3 vlan: Update skb->mac_header in __vlan_put_tag(). cxgb3: Add support for PCI ID 0x35. tcp: remove obsoleted comment about different passes TG3: &&/|| confusion ATM: misplaced parentheses? net/mv643xx: don't disable the mib timer too early and lock properly net/mv643xx: use GFP_ATOMIC while atomic atl1c: Atheros L1C Gigabit Ethernet driver net: Kill skb_truesize_check(), it only catches false-positives. net: forcedeth: Fix wake-on-lan regression
2009-02-22PM: Split up sysdev_[suspend|resume] from device_power_[down|up]Rafael J. Wysocki
Move the sysdev_suspend/resume from the callee to the callers, with no real change in semantics, so that we can rework the disabling of interrupts during suspend/hibernation. This is based on an earlier patch from Linus. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-21Merge branch 'hibernate'Linus Torvalds
* hibernate: PM: Fix suspend_console and resume_console to use only one semaphore PM: Wait for console in resume PM: Fix pm_notifiers during user mode hibernation swsusp: clean up shrink_all_zones() swsusp: dont fiddle with swappiness PM: fix build for CONFIG_PM unset PM/hibernate: fix "swap breaks after hibernation failures" PM/resume: wait for device probing to finish Consolidate driver_probe_done() loops into one place
2009-02-21Consolidate driver_probe_done() loops into one placeArjan van de Ven
there's a few places that currently loop over driver_probe_done(), and I'm about to add another one. This patch abstracts it into a helper to reduce duplication. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Len Brown <lenb@kernel.org> Acked-by: Greg KH <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-208250: fix boot hang with serial console when using with Serial Over Lan portMauro Carvalho Chehab
Intel 8257x Ethernet boards have a feature called Serial Over Lan. This feature works by emulating a serial port, and it is detected by kernel as a normal 8250 port. However, this emulation is not perfect, as also noticed on changeset 7500b1f602aad75901774a67a687ee985d85893f. Before this patch, the kernel were trying to check if the serial TX is capable of work using IRQ's. This were done with a code similar this: serial_outp(up, UART_IER, UART_IER_THRI); lsr = serial_in(up, UART_LSR); iir = serial_in(up, UART_IIR); serial_outp(up, UART_IER, 0); if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT) up->bugs |= UART_BUG_TXEN; This works fine for other 8250 ports, but, on 8250-emulated SoL port, the chip is a little lazy to down UART_IIR_NO_INT at UART_IIR register. Due to that, UART_BUG_TXEN is sometimes enabled. However, as TX IRQ keeps working, and the TX polling is now enabled, the driver miss-interprets the IRQ received later, hanging up the machine until a key is pressed at the serial console. This is the 6 version of this patch. Previous versions were trying to introduce a large enough delay between serial_outp and serial_in(up, UART_IIR), but not taking forever. However, the needed delay couldn't be safely determined. At the experimental tests, a delay of 1us solves most of the cases, but still hangs sometimes. Increasing the delay to 5us was better, but still doesn't solve. A very high delay of 50 ms seemed to work every time. However, poking around with delays and pray for it to be enough doesn't seem to be a good approach, even for a quirk. So, instead of playing with random large arbitrary delays, let's just disable UART_BUG_TXEN for all SoL ports. [akpm@linux-foundation.org: fix warnings] Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20spi_bitbang: add more lowlevel function documentationMichael Buesch
This adds more documentation of the lowlevel API to avoid future bugs. Signed-off-by: Michael Buesch <mb@bu3sch.de> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20slab: introduce kzfree()Johannes Weiner
kzfree() is a wrapper for kfree() that additionally zeroes the underlying memory before releasing it to the slab allocator. Currently there is code which memset()s the memory region of an object before releasing it back to the slab allocator to make sure security-sensitive data are really zeroed out after use. These callsites can then just use kzfree() which saves some code, makes users greppable and allows for a stupid destructor that isn't necessarily aware of the actual object size. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Matt Mackall <mpm@selenic.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-20netlink: add NETLINK_BROADCAST_ERROR socket optionPablo Neira Ayuso
This patch adds NETLINK_BROADCAST_ERROR which is a netlink socket option that the listener can set to make netlink_broadcast() return errors in the delivery to the caller. This option is useful if the caller of netlink_broadcast() do something with the result of the message delivery, like in ctnetlink where it drops a network packet if the event delivery failed, this is used to enable reliable logging and state-synchronization. If this socket option is not set, netlink_broadcast() only reports ESRCH errors and silently ignore ENOBUFS errors, which is what most netlink_broadcast() callers should do. This socket option is based on a suggestion from Patrick McHardy. Patrick McHardy can exchange this patch for a beer from me ;). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-20ethtool: Add RX pkt classification interfaceSantwona Behera
Signed-off-by: Santwona Behera <santwona.behera@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net>