aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2006-10-28[PATCH] taskstats: kill ->taskstats_lock in favor of ->siglockOleg Nesterov
signal_struct is (mostly) protected by ->sighand->siglock, I think we don't need ->taskstats_lock to protect ->stats. This also allows us to simplify the locking in fill_tgid(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28[PATCH] taskstats_tgid_alloc: optimizationOleg Nesterov
Every subthread (except first) does unneeded kmem_cache_alloc/kmem_cache_free. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28[PATCH] taskstats_tgid_free: fix usageOleg Nesterov
taskstats_tgid_free() is called on copy_process's error path. This is wrong. IF (clone_flags & CLONE_THREAD) We should not clear ->signal->taskstats, current uses it, it probably has a valid accumulated info. ELSE taskstats_tgid_init() set ->signal->taskstats = NULL, there is nothing to free. Move the callsite to __exit_signal(). We don't need any locking, entire thread group is exiting, nobody should have a reference to soon to be released ->signal. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28[PATCH] Constify compat_get_bitmap argumentStephen Rothwell
This means we can call it when the bitmap we want to fetch is declared const. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28[PATCH] __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'Giridhar Pemmasani
If __vmalloc is called to allocate memory with GFP_ATOMIC in atomic context, the chain of calls results in __get_vm_area_node allocating memory for vm_struct with GFP_KERNEL, causing the 'sleeping from invalid context' warning. This patch fixes it by passing the gfp flags along so __get_vm_area_node allocates memory for vm_struct with the same flags. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28[PATCH] vmscan: Fix temp_priority raceMartin Bligh
The temp_priority field in zone is racy, as we can walk through a reclaim path, and just before we copy it into prev_priority, it can be overwritten (say with DEF_PRIORITY) by another reclaimer. The same bug is contained in both try_to_free_pages and balance_pgdat, but it is fixed slightly differently. In balance_pgdat, we keep a separate priority record per zone in a local array. In try_to_free_pages there is no need to do this, as the priority level is the same for all zones that we reclaim from. Impact of this bug is that temp_priority is copied into prev_priority, and setting this artificially high causes reclaimers to set distress artificially low. They then fail to reclaim mapped pages, when they are, in fact, under severe memory pressure (their priority may be as low as 0). This causes the OOM killer to fire incorrectly. From: Andrew Morton <akpm@osdl.org> __zone_reclaim() isn't modifying zone->prev_priority. But zone->prev_priority is used in the decision whether or not to bring mapped pages onto the inactive list. Hence there's a risk here that __zone_reclaim() will fail because zone->prev_priority ir large (ie: low urgency) and lots of mapped pages end up stuck on the active list. Fix that up by decreasing (ie making more urgent) zone->prev_priority as __zone_reclaim() scans the zone's pages. This bug perhaps explains why ZONE_RECLAIM_PRIORITY was created. It should be possible to remove that now, and to just start out at DEF_PRIORITY? Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28[PATCH] mm: clean up pagecache allocationNick Piggin
- Consolidate page_cache_alloc - Fix splice: only the pagecache pages and filesystem data need to use mapping_gfp_mask. - Fix grab_cache_page_nowait: same as splice, also honour NUMA placement. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-27[PATCH] drivers: wait for threaded probes between initcall levelsAndrew Morton
The multithreaded-probing code has a problem: after one initcall level (eg, core_initcall) has been processed, we will then start processing the next level (postcore_initcall) while the kernel threads which are handling core_initcall are still executing. This breaks the guarantees which the layered initcalls previously gave us. IOW, we want to be multithreaded _within_ an initcall level, but not between different levels. Fix that up by causing the probing code to wait for all outstanding probes at one level to complete before we start processing the next level. Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-23[PATCH] Remove __must_check for device_for_each_child()Russell King
Eliminate more __must_check madness. The return code from device_for_each_child() depends on the values which the helper function returns. If the helper function always returns zero, it's utterly pointless to check the return code from device_for_each_child(). The only code which knows if the return value should be checked is the caller itself, so forcing the return code to always be checked is silly. Hence, remove the __must_check annotation. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] libata-sff: Allow for wacky systems [PATCH] ahci: readability tweak [PATCH] libata: typo fix [PATCH] ATA must depend on BLOCK [PATCH] libata: use correct map_db values for ICH8
2006-10-21Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6Linus Torvalds
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: [PATCH] x86-64: Revert timer routing behaviour back to 2.6.16 state [PATCH] x86-64: Overlapping program headers in physical addr space fix [PATCH] x86-64: Put more than one cpu in TARGET_CPUS [PATCH] x86: Revert new unwind kernel stack termination [PATCH] x86-64: Use irq_domain in ioapic_retrigger_irq [PATCH] i386: Disable nmi watchdog on all ThinkPads [PATCH] x86-64: Revert interrupt backlink changes [PATCH] x86-64: Fix ENOSYS in system call tracing [PATCH] i386: Fix fake return address [PATCH] x86-64: x86_64 add NX mask for PTE entry [PATCH] x86-64: Speed up dwarf2 unwinder [PATCH] x86: Use -maccumulate-outgoing-args [PATCH] x86-64: fix page align in e820 allocator [PATCH] x86-64: Fix for arch/x86_64/pci/Makefile CFLAGS [PATCH] i386: fix .cfi_signal_frame copy-n-paste error [PATCH] x86-64: typo in __assign_irq_vector when updating pos for vector and offset [PATCH] x86-64: x86_64 hot-add memory srat.c fix [PATCH] i386: Update defconfig [PATCH] x86-64: Update defconfig
2006-10-21[PATCH] cpuset: mempolicy migration typo fixPaul Jackson
Mistyped an ifdef CONFIG_CPUSETS - fixed. I doubt that anyone ever noticed. The impact of this typo was that if someone: 1) was using MPOL_BIND to force off node allocations 2) while using cpusets to constrain memory placement 3) when that cpuset was migrating that jobs memory 4) while the tasks in that job were actively forking then there was a rare chance that future allocations using that MPOL_BIND policy would be node local, not off node. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21[PATCH] Reintroduce NODES_SPAN_OTHER_NODES for powerpcAndy Whitcroft
Reintroduce NODES_SPAN_OTHER_NODES for powerpc Revert "[PATCH] Remove SPAN_OTHER_NODES config definition" This reverts commit f62859bb6871c5e4a8e591c60befc8caaf54db8c. Revert "[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES" This reverts commit a94b3ab7eab4edcc9b2cb474b188f774c331adf7. Also update the comments to indicate that this is still required and where its used. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Kravetz <kravetz@us.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21[PATCH] pci: declare pci_get_device_reverse()Andrew Morton
We seem to have lost the declaration of pci_get_device_reverse(), if we ever had one. Add a CONFIG_PCI=0 stub too. Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21[PATCH] md: endian annotations for the bitmap superblockNeilBrown
And a couple of bug fixes found by sparse. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21[PATCH] md: endian annotation for v1 superblock accessNeilBrown
Includes a couple of bugfixes found by sparse. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21[PATCH] md: add another COMPAT_IOCTL for mdNeilBrown
.. so that you can use bitmaps with 32bit userspace on a 64 bit kernel. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21[PATCH] libata: typo fixTejun Heo
Typo fix in commment. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-10-21Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] Remove SUID when splicing into an inode [PATCH] Add lockless helpers for remove_suid() [PATCH] Introduce generic_file_splice_write_nolock() [PATCH] Take i_mutex in splice_from_pipe()
2006-10-21[PATCH] i386: Disable nmi watchdog on all ThinkPadsAndi Kleen
Even newer Thinkpads have bugs in SMM code that causes hangs with NMI watchdog. Signed-off-by: Andi Kleen <ak@suse.de>
2006-10-21[PATCH] x86-64: Speed up dwarf2 unwinderJan Beulich
This changes the dwarf2 unwinder to do a binary search for CIEs instead of a linear work. The linker is unfortunately not able to build a proper lookup table at link time, instead it creates one at runtime as soon as the bootmem allocator is usable (so you'll continue using the linear lookup for the first [hopefully] few calls). The code should be ready to utilize a build-time created table once a fixed linker becomes available. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>
2006-10-20Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (36 commits) [Bluetooth] Fix HID disconnect NULL pointer dereference [Bluetooth] Add missing entry for Nokia DTL-4 PCMCIA card [Bluetooth] Add support for newer ANYCOM USB dongles [NET]: Can use __get_cpu_var() instead of per_cpu() in loopback driver. [IPV4] inet_peer: Group together avl_left, avl_right, v4daddr to speedup lookups on some CPUS [TCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in tcp_v4_err() [NETFILTER]: Missing check for CAP_NET_ADMIN in iptables compat layer [NETPOLL]: initialize skb for UDP [IPV6]: Fix route.c warnings when multiple tables are disabled. [TG3]: Bump driver version and release date. [TG3]: Add lower bound checks for tx ring size. [TG3]: Fix set ring params tx ring size implementation [NET]: reduce per cpu ram used for loopback stats [IPv6] route: Fix prohibit and blackhole routing decision [DECNET]: Fix input routing bug [TCP]: Bound TSO defer time [IPv4] fib: Remove unused fib_config members [IPV6]: Always copy rt->u.dst.error when copying a rt6_info. [IPV6]: Make IPV6_SUBTREES depend on IPV6_MULTIPLE_TABLES. [IPV6]: Clean up BACKTRACK(). ...
2006-10-20[PATCH] nfsd: nfs_replay_meAl Viro
We are using NFS_REPLAY_ME as a special error value that is never leaked to clients. That works fine; the only problem is mixing host- and network- endian values in the same objects. Network-endian equivalent would work just as fine; switch to it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] nfsd: misc endianness annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] nfsd: nfs4 code returns error values in net-endianAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] nfsd: vfs.c endianness annotationsAl Viro
don't use the same variable to store NFS and host error values Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] xdr annotations: NFSv4 serverAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] xdr annotations: NFSv3 serverAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] xdr annotations: NFSv2 serverAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] nfsfh simple endianness annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] nfsd: nfserrno() endianness annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] nfs: verifier is network-endianAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] xdr annotations: NFS readdir entriesAl Viro
on-the-wire data is big-endian [in large part pulled from Alexey's patch] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] lockd endianness annotationsAl Viro
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] fix svc_procfunc declarationAl Viro
svc_procfunc instances return __be32, not int Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] NFS: Deal with failure of invalidate_inode_pages2()Trond Myklebust
If invalidate_inode_pages2() fails, then it should in principle just be because the current process was signalled. In that case, we just want to ensure that the inode's page cache remains marked as invalid. Also add a helper to allow the O_DIRECT code to simply mark the page cache as invalid once it is finished writing, instead of calling invalidate_inode_pages2() itself. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] OOM killer meets userspace headersAlexey Dobriyan
Despite mm.h is not being exported header, it does contain one thing which is part of userspace ABI -- value disabling OOM killer for given process. So, a) create and export include/linux/oom.h b) move OOM_DISABLE define there. c) turn bounding values of /proc/$PID/oom_adj into defines and export them too. Note: mass __KERNEL__ removal will be done later. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] genirq: clean up irq-flow-type naming, fixIngo Molnar
Re-add the set_irq_chip_and_handler() prototype, it's still widely used. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Olaf Hering <olaf@aepfle.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] Make <linux/personality.h> userspace proofRalf Baechle
<linux/personality.h> contains the constants for personality(2) but also some defintions that are useless or even harmful in userspace such as the personality() macro. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] separate bdi congestion functions from queue congestion functionsAndrew Morton
Separate out the concept of "queue congestion" from "backing-dev congestion". Congestion is a backing-dev concept, not a queue concept. The blk_* congestion functions are retained, as wrappers around the core backing-dev congestion functions. This proper layering is needed so that NFS can cleanly use the congestion functions, and so that CONFIG_BLOCK=n actually links. Cc: "Thomas Maier" <balagi@justmail.de> Cc: "Jens Axboe" <jens.axboe@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: David Howells <dhowells@redhat.com> Cc: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20[PATCH] export clear_queue_congested and set_queue_congestedThomas Maier
Export the clear_queue_congested() and set_queue_congested() functions located in ll_rw_blk.c The functions are renamed to blk_clear_queue_congested() and blk_set_queue_congested(). (needed in the pktcdvd driver's bio write congestion control) Signed-off-by: Thomas Maier <balagi@justmail.de> Cc: Peter Osterlund <petero2@telia.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-19[PATCH] Add lockless helpers for remove_suid()Jens Axboe
Right now users have to grab i_mutex before calling remove_suid(), in the unlikely event that a call to ->setattr() may be needed. Split up the function in two parts: - One to check if we need to remove suid - One to actually remove it The first we can call lockless. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-19[PATCH] Introduce generic_file_splice_write_nolock()Mark Fasheh
This allows file systems to manage their own i_mutex locking while still re-using the generic_file_splice_write() logic. OCFS2 in particular wants this so that it can order cluster locks within i_mutex. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-19[PATCH] Take i_mutex in splice_from_pipe()Mark Fasheh
The splice_actor may be calling ->prepare_write() and ->commit_write(). We want i_mutex on the inode being written to before calling those so that we don't race i_size changes. The double locking behavior is done elsewhere in splice.c, and if we eventually want _nolock variants of generic_file_splice_write(), fs modules might have to replicate the nasty locking code. We introduce inode_double_lock() and inode_double_unlock() to consolidate the locking rules into one set of functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-18[TCP]: Bound TSO defer timeJohn Heffner
This patch limits the amount of time you will defer sending a TSO segment to less than two clock ticks, or the time between two acks, whichever is longer. On slow links, deferring causes significant bursts. See attached plots, which show RTT through a 1 Mbps link with a 100 ms RTT and ~100 ms queue for (a) non-TSO, (b) currnet TSO, and (c) patched TSO. This burstiness causes significant jitter, tends to overflow queues early (bad for short queues), and makes delay-based congestion control more difficult. Deferring by a couple clock ticks I believe will have a relatively small impact on performance. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-18[TIPC]: Added subscription cancellation capabilityLijun Chen
This patch allows a TIPC application to cancel an existing topology service subscription by re-requesting the subscription with the TIPC_SUB_CANCEL filter bit set. (All other bits of the cancel request must match the original subscription request.) Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Per Liden <per.liden@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-18Merge branch 'ubuntu-updates' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/bcollins/ubuntu-2.6 * 'ubuntu-updates' of master.kernel.org:/pub/scm/linux/kernel/git/bcollins/ubuntu-2.6: [pci_ids] Add Quicknet XJ vendor/device ID's. [valkyriefb] Ifdef for when CONFIG_NVRAM isn't enabled. [platinumfb] Ifdef for when CONFIG_NVRAM isn't enabled. [igafb] Add pci dev table for module auto loading. [controlfb] Ifdef for when CONFIG_NVRAM isn't enabled. [hid-core] TurboX Keyboard needs NOGET quirk. [ixj] Add pci dev table for module auto loading. [initio] Add pci dev table for module auto loading. [fdomain] Add pci dev table for module auto loading. [BusLogic] Add pci dev table for auto module loading. [mv643xx] Add pci device table for auto module loading. [alim7101] Add pci dev table for auto module loading.
2006-10-18PCI Hotplug: move pci_hotplug.h to include/linux/Greg Kroah-Hartman
This makes it possible to build pci hotplug drivers outside of the main kernel tree, and Sam keeps telling me to move local header files to their proper places... Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-18PCI: optionally sort device lists breadth-firstMatt Domsch
Problem: New Dell PowerEdge servers have 2 embedded ethernet ports, which are labeled NIC1 and NIC2 on the chassis, in the BIOS setup screens, and in the printed documentation. Assuming no other add-in ethernet ports in the system, Linux 2.4 kernels name these eth0 and eth1 respectively. Many people have come to expect this naming. Linux 2.6 kernels name these eth1 and eth0 respectively (backwards from expectations). I also have reports that various Sun and HP servers have similar behavior. Root cause: Linux 2.4 kernels walk the pci_devices list, which happens to be sorted in breadth-first order (or pcbios_find_device order on i386, which most often is breadth-first also). 2.6 kernels have both the pci_devices list and the pci_bus_type.klist_devices list, the latter is what is walked at driver load time to match the pci_id tables; this klist happens to be in depth-first order. On systems where, for physical routing reasons, NIC1 appears on a lower bus number than NIC2, but NIC2's bridge is discovered first in the depth-first ordering, NIC2 will be discovered before NIC1. If the list were sorted breadth-first, NIC1 would be discovered before NIC2. A PowerEdge 1955 system has the following topology which easily exhibits the difference between depth-first and breadth-first device lists. -[0000:00]-+-00.0 Intel Corporation 5000P Chipset Memory Controller Hub +-02.0-[0000:03-08]--+-00.0-[0000:04-07]--+-00.0-[0000:05-06]----00.0-[0000:06]----00.0 Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (labeled NIC2, 2.4 kernel name eth1, 2.6 kernel name eth0) +-1c.0-[0000:01-02]----00.0-[0000:02]----00.0 Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (labeled NIC1, 2.4 kernel name eth0, 2.6 kernel name eth1) Other factors, such as device driver load order and the presence of PCI slots at various points in the bus hierarchy further complicate this problem; I'm not trying to solve those here, just restore the device order, and thus basic behavior, that 2.4 kernels had. Solution: The solution can come in multiple steps. Suggested fix #1: kernel Patch below optionally sorts the two device lists into breadth-first ordering to maintain compatibility with 2.4 kernels. It adds two new command line options: pci=bfsort pci=nobfsort to force the sort order, or not, as you wish. It also adds DMI checks for the specific Dell systems which exhibit "backwards" ordering, to make them "right". Suggested fix #2: udev rules from userland Many people also have the expectation that embedded NICs are always discovered before add-in NICs (which this patch does not try to do). Using the PCI IRQ Routing Table provided by system BIOS, it's easy to determine which PCI devices are embedded, or if add-in, which PCI slot they're in. I'm working on a tool that would allow udev to name ethernet devices in ascending embedded, slot 1 .. slot N order, subsort by PCI bus/dev/fn breadth-first. It'll be possible to use it independent of udev as well for those distributions that don't use udev in their installers. Suggested fix #3: system board routing rules One can constrain the system board layout to put NIC1 ahead of NIC2 regardless of breadth-first or depth-first discovery order. This adds a significant level of complexity to board routing, and may not be possible in all instances (witness the above systems from several major manufacturers). I don't want to encourage this particular train of thought too far, at the expense of not doing #1 or #2 above. Feedback appreciated. Patch tested on a Dell PowerEdge 1955 blade with 2.6.18. You'll also note I took some liberty and temporarily break the klist abstraction to simplify and speed up the sort algorithm. I think that's both safe and appropriate in this instance. Signed-off-by: Matt Domsch <Matt_Domsch@dell.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-18pci: Additional search functionsAlan Cox
In order to finish converting to pci_get_* interfaces we need to add a couple of bits of missing functionaility pci_get_bus_and_slot() provides the equivalent to pci_find_slot() (pci_get_slot is already taken as a name for something similar but not the same) pci_get_device_reverse() is the equivalent of pci_find_device_reverse but refcounting Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>