Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2005-09-07	[PATCH] auxiliary vector cleanups	H. J. Lu
	The size of auxiliary vector is fixed at 42 in linux/sched.h. But it isn't very obvious when looking at linux/elf.h. This patch adds AT_VECTOR_SIZE so that we can change it if necessary when a new vector is added. Because of include file ordering problems, doing this necessitated the extraction of the AT_* symbols into a standalone header file. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] compat: be more consistent about [ug]id_t	Stephen Rothwell
	When I first wrote the compat layer patches, I was somewhat cavalier about the definition of compat_uid_t and compat_gid_t (or maybe I just misunderstood :-)). This patch makes the compat types much more consistent with the types we are being compatible with and hopefully will fix a few bugs along the way. compat type type in compat arch __compat_[ug]id_t __kernel_[ug]id_t __compat_[ug]id32_t __kernel_[ug]id32_t compat_[ug]id_t [ug]id_t The difference is that compat_uid_t is always 32 bits (for the archs we care about) but __compat_uid_t may be 16 bits on some. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] FUTEX_WAKE_OP: pthread_cond_signal() speedup	Jakub Jelinek
	ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter (which at least on UP usually means an immediate context switch to one of the waiter threads). This waiter wakes up and after a few instructions it attempts to acquire the cv internal lock, but that lock is still held by the thread calling pthread_cond_signal. So it goes to sleep and eventually the signalling thread is scheduled in, unlocks the internal lock and wakes the waiter again. Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal to avoid this performance issue, but it was removed when locks were redesigned to the 3 state scheme (unlocked, locked uncontended, locked contended). Following scenario shows why simply using FUTEX_REQUEUE in pthread_cond_signal together with using lll_mutex_unlock_force in place of lll_mutex_unlock is not enough and probably why it has been disabled at that time: The number is value in cv->__data.__lock. thr1 thr2 thr3 0 pthread_cond_wait 1 lll_mutex_lock (cv->__data.__lock) 0 lll_mutex_unlock (cv->__data.__lock) 0 lll_futex_wait (&cv->__data.__futex, futexval) 0 pthread_cond_signal 1 lll_mutex_lock (cv->__data.__lock) 1 pthread_cond_signal 2 lll_mutex_lock (cv->__data.__lock) 2 lll_futex_wait (&cv->__data.__lock, 2) 2 lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock) # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE 2 lll_mutex_unlock_force (cv->__data.__lock) 0 cv->__data.__lock = 0 0 lll_futex_wake (&cv->__data.__lock, 1) 1 lll_mutex_lock (cv->__data.__lock) 0 lll_mutex_unlock (cv->__data.__lock) # Here, lll_mutex_unlock doesn't know there are threads waiting # on the internal cv's lock Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal, but it will cost us not one, but 2 extra syscalls and, what's worse, one of these extra syscalls will be done for every single waiting loop in pthread_cond_wait. We would need to use lll_mutex_unlock_force in pthread_cond_signal after requeue and lll_mutex_cond_lock in pthread_cond_wait after lll_futex_wait. Another alternative is to do the unlocking pthread_cond_signal needs to do (the lock can't be unlocked before lll_futex_wake, as that is racy) in the kernel. I have implemented both variants, futex-requeue-glibc.patch is the first one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel. The kernel interface allows userland to specify how exactly an unlocking operation should look like (some atomic arithmetic operation with optional constant argument and comparison of the previous futex value with another constant). It has been implemented just for ppc*, x86_64 and i?86, for other architectures I'm including just a stub header which can be used as a starting point by maintainers to write support for their arches and ATM will just return -ENOSYS for FUTEX_WAKE_OP. The requeue patch has been (lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running 32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL. With the following benchmark on UP x86-64 I get: for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \ for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done time elf/ld.so --library-path .:nptl-orig /tmp/bench real 0m0.655s user 0m0.253s sys 0m0.403s real 0m0.657s user 0m0.269s sys 0m0.388s time elf/ld.so --library-path .:nptl-requeue /tmp/bench real 0m0.496s user 0m0.225s sys 0m0.271s real 0m0.531s user 0m0.242s sys 0m0.288s time elf/ld.so --library-path .:nptl-wake_op /tmp/bench real 0m0.380s user 0m0.176s sys 0m0.204s real 0m0.382s user 0m0.175s sys 0m0.207s The benchmark is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00001.txt Older futex-requeue-glibc.patch version is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00002.txt Older futex-wake_op-glibc.patch version is at: http://sourceware.org/ml/libc-alpha/2005-03/txt00003.txt Will post a new version (just x86-64 fixes so that the patch applies against pthread_cond_signal.S) to libc-hacker ml soon. Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded testcase that will not test the atomicity of the operation, but at least check if the threads that should have been woken up are woken up and whether the arithmetic operation in the kernel gave the expected results. Acked-by: Ingo Molnar <mingo@redhat.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Jamie Lokier <jamie@shareable.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] x86_64: prefetchw() can fall back to prefetch() if !3DNOW	Eric Dumazet
	This is a multi-part message in MIME format. If the cpu lacks 3DNOW feature, we can use a normal prefetcht0 instruction instead of NOP5. "prefetchw (%rxx)" and "prefetcht0 (%rxx)" have the same length, ranging from 3 to 5 bytes depending on the register. So this patch even helps AMD64, shortening the length of the code. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] i386: cleanup serialize msr	Zachary Amsden
	i386 arch cleanup. Introduce the serialize macro to serialize processor state. Why the microcode update needs it I am not quite sure, since wrmsr() is already a serializing instruction, but it is a microcode update, so I will keep the semantic the same, since this could be a timing workaround. As far as I can tell, this has always been there since the original microcode update source. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] x86_64: avoid some atomic operations during address space destruction	Zachary Amsden
	Any architecture that has hardware updated A/D bits that require synchronization against other processors during PTE operations can benefit from doing non-atomic PTE updates during address space destruction. Originally done on i386, now ported to x86_64. Doing a read/write pair instead of an xchg() operation saves the implicit lock, which turns out to be a big win on 32-bit (esp w PAE). Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] sab: consolidate kmem_bufctl_t	Kyle Moffett
	This is used only in slab.c and each architecture gets to define whcih underlying type is to be used. Seems a bit silly - move it to slab.c and use the same type for all architectures: unsigned int. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc()	Chen, Kenneth W
	I don't think we need to call hugetlb_clean_stale_pgtable() anymore in 2.6.13 because of the rework with free_pgtables(). It now collect all the pte page at the time of munmap. It used to only collect page table pages when entire one pgd can be freed and left with staled pte pages. Not anymore with 2.6.13. This function will never be called and We should turn it into a BUG_ON. I also spotted two problems here, not Adam's fault :-) (1) in huge_pte_alloc(), it looks like a bug to me that pud is not checked before calling pmd_alloc() (2) in hugetlb_clean_stale_pgtable(), it also missed a call to pmd_free_tlb. I think a tlb flush is required to flush the mapping for the page table itself when we clear out the pmd pointing to a pte page. However, since hugetlb_clean_stale_pgtable() is never called, so it won't trigger the bug. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Cc: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] hugetlb: add pte_huge() macro	Adam Litke
	This patch adds a macro pte_huge(pte) for i386/x86_64 which is needed by a patch later in the series. Instead of repeating (_PAGE_PRESENT \| _PAGE_PSE), I've added __LARGE_PTE to i386 to match x86_64. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: <linux-mm@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] mm: correct _PAGE_FILE comment	Paolo 'Blaisorblade' Giarrusso
	_PAGE_FILE does not indicate whether a file is in page / swap cache, it is set just for non-linear PTE's. Correct the comment for i386, x86_64, UML. Also clearify _PAGE_NONE. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05	[PATCH] mm: consolidate get_order	Stephen Rothwell
	Someone mentioned that almost all the architectures used basically the same implementation of get_order. This patch consolidates them into asm-generic/page.h and includes that in the appropriate places. The exceptions are ia64 and ppc which have their own (presumably optimised) versions. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29	[NET]: Fix ipl=>ihl typo in ip_fast_csum	Thomas Graf
	Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29	[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options	Patrick McHardy
	Allows overriding of sysctl_{wmem,rmrm}_max Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-26	[PATCH] x86_64: Tell VM about holes in nodes	Andi Kleen
	Some nodes can have large holes on x86-64. This fixes problems with the VM allowing too many dirty pages because it overestimates the number of available RAM in a node. In extreme cases you can end up with all RAM filled with dirty pages which can lead to deadlocks and other nasty behaviour. This patch just tells the VM about the known holes from e820. Reserved (like the kernel text or mem_map) is still not taken into account, but that should be only a few percent error now. Small detail is that the flat setup uses the NUMA free_area_init_node() now too because it offers more flexibility. (akpm: lotsa thanks to Martin for working this problem out) Cc: Martin Bligh <mbligh@mbligh.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16	[PATCH] i386 / desc_empty macro is incorrect	Zachary Amsden
	Chuck Ebbert noticed that the desc_empty macro is incorrect. Fix it. Thankfully, this is not used as a security check, but it can falsely overwrite TLS segments with carefully chosen base / limits. I do not believe this is an issue in practice, but it is a kernel bug. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Chris Wright <chrisw@osdl.org> [ x86-64 had the same problem, and the same fix. Linus ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-14	Revert PCIBIOS_MIN_IO changes for 2.6.13	Linus Torvalds
	This reverts commits 71db63acff69618b3d9d3114bd061938150e146b [PATCH] increase PCIBIOS_MIN_IO on x86 and 0b2bfb4e7ff61f286676867c3508569bea6fbf7a ACPI: increase PCIBIOS_MIN_IO on x86 since Lukas Sandströ<lukass@etek.chalmers.se> reports that this breaks his on-board nvidia audio. We should re-visit this later. For now we revert the change Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-02	[PATCH] increase PCIBIOS_MIN_IO on x86	Ivan Kokshaysky
	There is a number of x86 laptops that have some non-PCI IO ports in the 0x1000-0x1fff range, and it's quite hard to control the correct order of resource allocation between PCI and other subsystems controlling these ports. Especially with modular kernel. So just increase PCIBIOS_MIN_IO to 0x4000 to prevent any new PCI resource allocations in the problematic range (this limitation must apply _only_ to the root bus resources - see Linus' change in pci_bus_alloc_resource). As PCIBIOS_MIN_IO and PCIBIOS_MIN_CARDBUS_IO are the same now on i386 and x86-64, we can remove the latter. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29	[PATCH] Fix sync_tsc hang	Eric W. Biederman
	sync_tsc was using smp_call_function to ask the boot processor to report it's tsc value. smp_call_function performs an IPI_send_allbutself which is a broadcast ipi. There is a window during processor startup during which the target cpu has started and before it has initialized it's interrupt vectors so it can properly process an interrupt. Receveing an interrupt during that window will triple fault the cpu and do other nasty things. Why cli does not protect us from that is beyond me. The simple fix is to match ia64 and provide a smp_call_function_single. Which avoids the broadcast and is more efficient. This certainly fixes the problem of getting stuck on boot which was very easy to trigger on my SMP Hyperthreaded Xeon, and I think it fixes it for the right reasons. Minor changes by AK Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29	[PATCH] x86_64 machine_kexec: Use standard pagetable helpers	Eric W. Biederman
	Use the standard hardware page table manipulation macros. This is possible now that linux works with all 4 levels of the page tables. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Fix gcc 4 warning in sched_find_first_bit	Jesse Millan
	This patch eliminates the GCC4 warning on the x86_64 platform: kernel/sched.c:1824: warning: control may reach end of non-void function 'sched_find_first_bit' being inlined. The change follows the lead of others, i.e. it is guaranteed that at least one of b[0], b[1], or b[2] will have a bit set and evaluate to true. That being said, GCC4.0.0 notices that the code flow does not return anything if b[0], b[1] and b[2] are not true. Since we know better, if it's not b[0] or b[1], it has to be b[2]. Signed-off-by: Jesse Millan <jessem@cs.pdx.edu> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Switch to the interrupt stack when running a softirq in ↵	Andi Kleen
	local_bh_enable() This avoids some potential stack overflows with very deep softirq callchains. i386 does this too. TOADD CFI annotation Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Turn BUG data into valid instruction	Andi Kleen
	This avoids confusing the disassembler. Costs 2 bytes per BUG. Thanks to Suresh Siddha and Jan Beulich for suggesting suitable instructions. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Fix incorrectly defined MSR_K8_SYSCFG	Andi Kleen
	Harmless because the kernel didn't use it. Noticed by Travis Betak Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Fix some typos in system.h comments	Andi Kleen
	Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Remove obsolete eat_key prototype	Andi Kleen
	Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Fix some comments in tlbflush.h	Andi Kleen
	Were either outdated or misleading. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Some cleanup in setup64.c	Andi Kleen
	Minor cleanup. Move things into their include files, remove obsolete includes, fix indentation, remove obsolete special cases etc. I also added the per cpu section to asm-generic/sections.h and fixed init/main.c to use it. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: i386/x86_64: remove prototypes for not existing functions in ↵	Andi Kleen
	smp.h Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28	[PATCH] x86_64: Use for_each_cpu_mask for clustered IPI flush	Andi Kleen
	Makes it slightly more efficient. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26	[PATCH] x86_64: Implemenent machine_emergency_restart	Eric W. Biederman
	It is not safe to call set_cpus_allowed() in interrupt context and disabling the apics is complicated code. So unconditionally skip machine_shutdown in machine_emergency_reboot on x86_64. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26	[PATCH] Add emergency_restart()	Eric W. Biederman
	When the kernel is working well and we want to restart cleanly kernel_restart is the function to use. But in many instances the kernel wants to reboot when thing are expected to be working very badly such as from panic or a software watchdog handler. This patch adds the function emergency_restart() so that callers can be clear what semantics they expect when calling restart. emergency_restart() is expected to be callable from interrupt context and possibly reliable in even more trying circumstances. This is an initial generic implementation for all architectures. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26	[PATCH] inotify: add x86-64 syscall entries	Robert Love
	Add inotify syscall entries to x86-64. Signed-off-by: Robert Love <rml@novell.com> Signed-off-by: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12	[ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc...	Len Brown
	Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12	[ACPI] enable C2 and C3 idle power states on SMP	Venkatesh Pallipadi
	http://bugzilla.kernel.org/show_bug.cgi?id=4401 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12	[ACPI] PNPACPI vs sound IRQ	David Shaohua Li
	http://bugme.osdl.org/show_bug.cgi?id=4016 Written-by: David Shaohua Li <shaohua.li@intel.com> Acked-by: Adam Belay <abelay@novell.com> Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-07	[PATCH] MTRR suspend/resume cleanup	Shaohua Li
	There has been some discuss about solving the SMP MTRR suspend/resume breakage, but I didn't find a patch for it. This is an intent for it. The basic idea is moving mtrr initializing into cpu_identify for all APs (so it works for cpu hotplug). For BP, restore_processor_state is responsible for restoring MTRR. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30	[PATCH] x86: i8253/i8259A lock cleanup	Ingo Molnar
	Introduce proper declarations for i8253_lock and i8259A_lock. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29	[PATCH] Serial: Split 8250 port table (part 2)	Russell King
	Remove legacy ISA serial ports for Accent, Boca, Fourport, Hub6 and MCA from the architecture specific serial.h include. The only ports which remain in asm-*/serial.h are the platform specific entries. These should really be converted by platform maintainers to use a platform device, such as can be found in arch/arm/mach-footbridge/isa.c Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-27	Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	Greg KH

2005-06-27	[PATCH] PCI: fix up errors after dma bursting patch and CONFIG_PCI=n	Andrew Morton
	With CONFIG_PCI=n: In file included from include/linux/pci.h:917, from lib/iomap.c:6: include/asm/pci.h:104: warning: `enum pci_dma_burst_strategy' declared inside parameter list include/asm/pci.h:104: warning: its scope is only this definition or declaration, which is probably not what you want. include/asm/pci.h: In function `pci_dma_burst_advice': include/asm/pci.h:106: dereferencing pointer to incomplete type include/asm/pci.h:106: `PCI_DMA_BURST_INFINITY' undeclared (first use in this function) include/asm/pci.h:106: (Each undeclared identifier is reported only once include/asm/pci.h:106: for each function it appears in.) make[1]: *** [lib/iomap.o] Error 1 Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27	[PATCH] PCI: DMA bursting advice	David S. Miller
	After seeing, at best, "guesses" as to the following kind of information in several drivers, I decided that we really need a way for platforms to specifically give advice in this area for what works best with their PCI controller implementation. Basically, this new interface gives DMA bursting advice on PCI. There are three forms of the advice: 1) Burst as much as possible, it is not necessary to end bursts on some particular boundary for best performance. 2) Burst on some byte count multiple. A DMA burst to some multiple of number of bytes may be done, but it is important to end the burst on an exact multiple for best performance. The best example of this I am aware of are the PPC64 PCI controllers, where if you end a burst mid-cacheline then chip has to refetch the data and the IOMMU translations which hurts performance a lot. 3) Burst on a single byte count multiple. Bursts shall end exactly on the next multiple boundary for best performance. Sparc64 and Alpha's PCI controllers operate this way. They disconnect any device which tries to burst across a cacheline boundary. Actually, newer sparc64 PCI controllers do not have this behavior. That is why the "pdev" is passed into the interface, so I can add code later to check which PCI controller the system is using and give advice accordingly. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27	[PATCH] seccomp: tsc disable	Andrea Arcangeli
	I believe at least for seccomp it's worth to turn off the tsc, not just for HT but for the L2 cache too. So it's up to you, either you turn it off completely (which isn't very nice IMHO) or I recommend to apply this below patch. This has been tested successfully on x86-64 against current cogito repository (i686 compiles so I didn't bother testing ;). People selling the cpu through cpushare may appreciate this bit for a peace of mind. There's no way to get any timing info anymore with this applied (gettimeofday is forbidden of course). The seccomp environment is completely deterministic so it can't be allowed to get timing info, it has to be deterministic so in the future I can enable a computing mode that does a parallel computing for each task with server side transparent checkpointing and verification that the output is the same from all the 2/3 seller computers for each task, without the buyer even noticing (for now the verification is left to the buyer client side and there's no checkpointing, since that would require more kernel changes to track the dirty bits but it'll be easy to extend once the basic mode is finished). Eliminating a cold-cache read of the cr4 global variable will save one cacheline during the tlb flush while making the code per-cpu-safe at the same time. Thanks to Mikael Pettersson for noticing the tlb flush wasn't per-cpu-safe. The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but it'll be transparent to the switch_to code since the IPI won't make any change to the cr4 contents from the point of view of the interrupted code and since it's now all per-cpu stuff, it will not race. So no need to disable irqs in switch_to slow path. Signed-off-by: Andrea Arcangeli <andrea@cpushare.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27	[PATCH] Update cfq io scheduler to time sliced design	Jens Axboe
	This updates the CFQ io scheduler to the new time sliced design (cfq v3). It provides full process fairness, while giving excellent aggregate system throughput even for many competing processes. It supports io priorities, either inherited from the cpu nice value or set directly with the ioprio_get/set syscalls. The latter closely mimic set/getpriority. This import is based on my latest from -mm. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25	[PATCH] Kdump: Export crash notes section address through sysfs	Vivek Goyal
	o Following patch exports kexec global variable "crash_notes" to user space through sysfs as kernel attribute in /sys/kernel. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25	[PATCH] kexec: x86_64 kexec implementation	Eric W. Biederman
	This is the x86_64 implementation of machine kexec. 32bit compatibility support has been implemented, and machine_kexec has been enhanced to not care about the changing internal kernel paget table structures. From: Alexander Nyberg <alexn@dsv.su.se> build fix Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25	[PATCH] kexec: x86_64: add CONFIG_PHYSICAL_START	Eric W. Biederman
	For one kernel to report a crash another kernel has created we need to have 2 kernels loaded simultaneously in memory. To accomplish this the two kernels need to built to run at different physical addresses. This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel so we can do just that. You need to know what you are doing and the ramifications are before changing this value, and most users won't care so I have made it depend on CONFIG_EMBEDDED bzImage kernels will work and run at a different address when compiled with this option but they will still load at 1MB. If you need a kernel loaded at a different address as well you need to boot a vmlinux. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25	[PATCH] kexec: x86_64: restore apic virtual wire mode on shutdown	Eric W. Biederman
	When coming out of apic mode attempt to set the appropriate apic back into virtual wire mode. This improves on previous versions of this patch by by never setting bot the local apic and the ioapic into veritual wire mode. This code looks at data from the mptable to see if an ioapic has an ExtInt input to make this decision. A future improvement is to figure out which apic or ioapic was in virtual wire mode at boot time and to remember it. That is potentially a more accurate method, of selecting which apic to place in virutal wire mode. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25	[PATCH] kexec: x86: rename APIC_MODE_EXINT	Eric W. Biederman
	From: "Maciej W. Rozycki" <macro@linux-mips.org> Rename APIC_MODE_EXINT to APIC_MODE_EXTINT - I think it should be named after what the mode is called in documentation. From: "Eric W. Biederman" <ebiederm@lnxi.com> I have reduced this patch to just the name change in the header. And integrated the changes into the patches that add those lines. Otherwise I ran into some ugly dependencies. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25	[PATCH] sched: sched tuning	Nick Piggin
	Do some basic initial tuning. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25	[PATCH] sched: balance on fork	Nick Piggin
	Reimplement the balance on exec balancing to be sched-domains aware. Use this to also do balance on fork balancing. Make x86_64 do balance on fork over the NUMA domain. The problem that the non sched domains aware blancing became apparent on dual core, multi socket opterons. What we want is for the new tasks to be sent to a different socket, but more often than not, we would first load up our sibling core, or fill two cores of a single remote socket before selecting a new one. This gives large improvements to STREAM on such systems. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>