aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2006-06-27Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] ata_piix: add ICH6/7/8 to Kconfig [PATCH] sata_sil: disable hotplug interrupts on two ATI IXPs [PATCH] libata: cosmetic updates [PATCH] ata: add some NVIDIA chipset IDs [PATCH] libata reduce timeouts [PATCH] libata: implement ata_port_max_devices() [PATCH] libata: make two functions global [PATCH] libata: update ata_do_simple_cmd() [PATCH] libata: move ata_do_simple_cmd() below ata_exec_internal() [PATCH] libata: clear EH action on device detach [PATCH] libata: implement and use ata_deh_dev_action() [PATCH] libata: move ata_eh_clear_action() upward [PATCH] libata.h needs scatterlist.h [libata] sata_vsc: partially revert a PCI ID-related commit [libata] Bump versions
2006-06-27Properly delete sound/ppc/toonie.cLinus Torvalds
The previous "delete" had actually just truncated it to a zero size, something that can easily happen if you just apply a patch. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] i2c-i801.c: don't pci_disable_device() after it was just enabledDaniel Ritz
Commit 02dd7ae2892e5ceff111d032769c78d3377df970 ("[PATCH] i2c-i801: Merge setup function") has a missing return 0 in the _probe() function. This means the error path is always executed and pci_disable_device() is called even when the device just got successfully enabled. Having the SMBus device disabled makes some systems (eg. Fujitsu-Siemens Lifebook E8010) hang hard during power-off. Intead of reverting the whole commit this patch fixes it up: - don't ever call pci_disable_device(), also not in the _remove() function to avoid hangs - fix missing pci_release_region() in error path Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (25 commits) [CIFS] Fix authentication choice so we do not force NTLMv2 unless the [CIFS] Fix alignment of unicode strings in previous patch [CIFS] Fix allocation of buffers for new session setup routine to allow [CIFS] Remove calls to to take f_owner.lock [CIFS] remove some redundant null pointer checks [CIFS] Fix compile warning when CONFIG_CIFS_EXPERIMENTAL is off [CIFS] Enable sec flags on mount for cifs (part one) [CIFS] Fix suspend/resume problem which causes EIO on subsequent access to [CIFS] fix minor compile warning when config_cifs_weak_security is off [CIFS] NTLMv2 support part 5 [CIFS] Add support for readdir to legacy servers [CIFS] NTLMv2 support part 4 [CIFS] NTLMv2 support part 3 [CIFS] NTLMv2 support part 2 [CIFS] Fix mask so can set new cifs security flags properly CIFS] Support for older servers which require plaintext passwords - part 2 [CIFS] Support for older servers which require plaintext passwords [CIFS] Fix mapping of old SMB return code Invalid Net Name so it is [CIFS] Missing brace [CIFS] Do not overwrite aops ...
2006-06-27[PATCH] m68knommu: use Kconfig RAM config options in 68328 startup codeGreg Ungerer
Switch to using the new RAM Kconfig settings, instead of linker defined regions in ROM specific 68328 startup code. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: use Kconfig RAM config options in 68360 ROM startup codeGreg Ungerer
Switch to using the new RAM Kconfig settings, instead of linker defined regions in ROM specific 68360 startup code. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: use Kconfig RAM config options in 68360 RAM startup codeGreg Ungerer
Switch to using the new RAM Kconfig settings, instead of linker defined regions in RAM specific 68360 startup code. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: remove NO_FORMAT_VECi from ptrace.h headerGreg Ungerer
Remove NO_FORMAT_VEC conditional check. It is not used or defined anywhere. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: FEC driver event/irq fixesGreg Ungerer
Collection of fixes for the ColdFire FEC ethernet driver: . reworked event setting so that it occurs after the MII setup. roucaries bastien <roucaries.bastien@gmail.com> . Do not read cbd_sc in memory for each bit we test. Once per buffer is enough. . Overrun errors must increase `rx_fifo_errors', not `rx_crc_errors' . No need for a special value to activate rx or tx. Only write access matters. . Simplify parameter of eth_copy_and_sum : `data' has already the right value. . Some spelling fixes. Signed-off-by: Philippe De Muyter <phdm@macqel.be> Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: FEC driver set different priority/level on each IRQWillson Callan
Set different irq priority levels for each IRQ requested. According to the Freescale ColdFire documentation each separate IRQ must have its own unique priority/level combination. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: FEC driver support for the ColdFire 523x CPU familyMatt Waddel
Add support for the FEC module in the ColdFire 532x CPU family. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: avoid fec driver hang when link disappearsPhilippe De Muyter
Avoid requesting a `Graceful Transmit Stop' when link has disappeared, because that request cannot complete without link. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: update m68knommu defconfnigGreg Ungerer
Updated defconfig for m68knommu arch. Includes recent changes to the clock and RAM configuration options. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: build support for the Freescale 532x CPU familyMatt Waddel
Add build support for the M523x ColdFire CPU family. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: build support for the Avnet/5282 boardDaniel Alomar
Add support for the Avnet/5282 board. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] m68knommu: remove useless compiler argsPhilipe De Muyter
Here is a small patch that made my kernel .text segment shrink by 8k IIRC on my 5272-based board, by removing `-Wa,-S' from CFLAGS. The `-Wa,-S' option prevents `gas' from using short forms of jsr. Without it, `gas' replaces `jsr xxx.l' (6 bytes) by `jsr xxx@pc' (4 bytes) when possible. On 5272, both forms are equally fast. The `-Wa,-m5307' option is useless, because gcc already gives it to `gas' from the `-m5307' option. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] voyager: add cpu_present_mapJames Bottomley
Voyager stopped booting some time in the 2.6.16-2.6.17 timeframe; the reason was that it doesn't have a cpu_present_map, so add one. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvbLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (26 commits) V4L/DVB (4263): Fix warning when compiling on 64 bit machines V4L/DVB (4261): Included required header for in-kernel compilation V4L/DVB (4260): Stradis.c: make 2 functions static V4L/DVB (4259): Pass an explicit log prefix to cx2341x_log_status V4L/DVB (4257): Fix 64-bit compile warnings. V4L/DVB (4255): Tda9887 default TOP value is 0x10 V4L/DVB (4254): Remove obsoleted tuner_debug option. V4L/DVB (4253): IVTV VBI format description too long. V4L/DVB (4252): Remove duplicate 'tda9887' in info messages. V4L/DVB (4245): Reduce the amount of pvrusb2-sourced noise going into the system log V4L/DVB (4244): Implement use of cx2341x module in pvrusb2 driver V4L/DVB (4243): Exploit new V4L control features in pvrusb2 V4L/DVB (4242): Don't suspend encoder when changing its attributes (in pvrusb2) V4L/DVB (4241): Fix faulty encoder error recovery in pvrusb2 V4L/DVB (4240): Various V4L control enhancements in pvrusb2 V4L/DVB (4239): Handle boolean controls in pvrusb2 V4L/DVB (4238): Make sure flags field is initialized when quering a control in pvrusb2 V4L/DVB (4237): Move LOG_STATUS bracketing to a different part of the pvrusb2 driver V4L/DVB (4236): Rearrange things in pvrusb2 driver in preparation for using cx2341x module V4L/DVB (4235): Increase the maximum number of controls that pvrusb2-sysfs.c can handle. ...
2006-06-27[PATCH] i4l fix DLE masking in isdn_tty_try_readKarsten Keil
DLE masking was non-functional since the new tty handling. Found by Peter Evertz <leo2@pec.homeip.net> Signed-off-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] do_IRQ() warning fixAndrew Morton
arch/i386/kernel/irq.c: In function 'do_IRQ': arch/i386/kernel/irq.c:104: warning: suggest parentheses around arithmetic in operand of | Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] drivers/message/i2o/iop.c: unexport i2o_msg_nop()Adrian Bunk
It's available in a header as a static inline - there's no need to export it. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] drivers/char/ipmi/ipmi_msghandler.c: make proc_ipmi_root staticAdrian Bunk
Make struct proc_ipmi_root static. Besides this, tremove removes an unused #ifdef CONFIG_PROC_FS from include/linux/ipmi.h. Acked-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] Remove redundant NULL checks before [kv]free - in drivers/Jesper Juhl
Remove redundant NULL chck before kfree + tiny CodingStyle cleanup for drivers/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] Remove redundant NULL checks before [kv]free - in kernel/Jesper Juhl
Remove redundant kfree NULL checks from kernel/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] Remove redundant NULL checks before [kv]free - in fs/Jesper Juhl
Remove redundant NULL checks before kfree for fs/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] futex_requeue() optimizationSebastien Dugue
In futex_requeue(), when the 2 futexes keys hash to the same bucket, there is no need to move the futex_q to the end of the bucket list. Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] rtmutex: Propagate priority settings into PI lock chainsThomas Gleixner
When the priority of a task, which is blocked on a lock, changes we must propagate this change into the PI lock chain. Therefor the chain walk code is changed to get rid of the references to current to avoid false positives in the deadlock detector, as setscheduler might be called by a task which holds the lock on which the task whose priority is changed is blocked. Also add some comments about the get/put_task_struct usage to avoid confusion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] rtmutex: Modify rtmutex-tester to test the setscheduler propagationThomas Gleixner
Make test suite setscheduler calls asynchronously. Remove the waits in the test cases and add a new testcase to verify the correctness of the setscheduler priority propagation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] Drop tasklist lock in do_sched_setschedulerThomas Gleixner
There is no need to hold tasklist_lock across the setscheduler call, when we pin the task structure with get_task_struct(). Interrupts are disabled in setscheduler anyway and the permission checks do not need interrupts disabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] fix rt-mutex defaults and dependenciesRoman Zippel
Fix defaults and dependencies. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: futex_lock_pi/futex_unlock_pi supportIngo Molnar
This adds the actual pi-futex implementation, based on rt-mutexes. [dino@in.ibm.com: fix an oops-causing race] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: rt mutex futex apiIngo Molnar
Add proxy-locking rt-mutex functionality needed by pi-futexes. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: rt mutex testerThomas Gleixner
RT-mutex tester: scriptable tester for rt mutexes, which allows userspace scripting of mutex unit-tests (and dynamic tests as well), using the actual rt-mutex implementation of the kernel. [akpm@osdl.org: fixlet] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: rt mutex debugIngo Molnar
Runtime debugging functionality for rt-mutexes. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: rt mutex docsSteven Rostedt
Add rt-mutex documentation. [rostedt@goodmis.org: Update rt-mutex-design.txt as per Randy Dunlap suggestions] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: rt mutex coreIngo Molnar
Core functions for the rt-mutex subsystem. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: scheduler support for piIngo Molnar
Add framework to boost/unboost the priority of RT tasks. This consists of: - caching the 'normal' priority in ->normal_prio - providing a functions to set/get the priority of the task - make sched_setscheduler() aware of boosting The effective_prio() cleanups also fix a priority-calculation bug pointed out by Andrey Gelman, in set_user_nice(). has_rt_policy() fix: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Andrey Gelman <agelman@012.net.il> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: add plist implementationIngo Molnar
Add the priority-sorted list (plist) implementation. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: introduce WARN_ON_SMPIngo Molnar
Introduce a new WARN_ON variant: WARN_ON_SMP(cond). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: introduce debug_check_no_locks_freed()Ingo Molnar
Add debug_check_no_locks_freed(), as a central inline to add bad-lock-free-debugging functionality to. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: robust futex docs fixIngo Molnar
Fix typo in Documentation/robust-futexes.txt. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pi-futex: futex code cleanupsIngo Molnar
We are pleased to announce "lightweight userspace priority inheritance" (PI) support for futexes. The following patchset and glibc patch implements it, ontop of the robust-futexes patchset which is included in 2.6.16-mm1. We are calling it lightweight for 3 reasons: - in the user-space fastpath a PI-enabled futex involves no kernel work (or any other PI complexity) at all. No registration, no extra kernel calls - just pure fast atomic ops in userspace. - in the slowpath (in the lock-contention case), the system call and scheduling pattern is in fact better than that of normal futexes, due to the 'integrated' nature of FUTEX_LOCK_PI. [more about that further down] - the in-kernel PI implementation is streamlined around the mutex abstraction, with strict rules that keep the implementation relatively simple: only a single owner may own a lock (i.e. no read-write lock support), only the owner may unlock a lock, no recursive locking, etc. Priority Inheritance - why, oh why??? ------------------------------------- Many of you heard the horror stories about the evil PI code circling Linux for years, which makes no real sense at all and is only used by buggy applications and which has horrible overhead. Some of you have dreaded this very moment, when someone actually submits working PI code ;-) So why would we like to see PI support for futexes? We'd like to see it done purely for technological reasons. We dont think it's a buggy concept, we think it's useful functionality to offer to applications, which functionality cannot be achieved in other ways. We also think it's the right thing to do, and we think we've got the right arguments and the right numbers to prove that. We also believe that we can address all the counter-arguments as well. For these reasons (and the reasons outlined below) we are submitting this patch-set for upstream kernel inclusion. What are the benefits of PI? The short reply: ---------------- User-space PI helps achieving/improving determinism for user-space applications. In the best-case, it can help achieve determinism and well-bound latencies. Even in the worst-case, PI will improve the statistical distribution of locking related application delays. The longer reply: ----------------- Firstly, sharing locks between multiple tasks is a common programming technique that often cannot be replaced with lockless algorithms. As we can see it in the kernel [which is a quite complex program in itself], lockless structures are rather the exception than the norm - the current ratio of lockless vs. locky code for shared data structures is somewhere between 1:10 and 1:100. Lockless is hard, and the complexity of lockless algorithms often endangers to ability to do robust reviews of said code. I.e. critical RT apps often choose lock structures to protect critical data structures, instead of lockless algorithms. Furthermore, there are cases (like shared hardware, or other resource limits) where lockless access is mathematically impossible. Media players (such as Jack) are an example of reasonable application design with multiple tasks (with multiple priority levels) sharing short-held locks: for example, a highprio audio playback thread is combined with medium-prio construct-audio-data threads and low-prio display-colory-stuff threads. Add video and decoding to the mix and we've got even more priority levels. So once we accept that synchronization objects (locks) are an unavoidable fact of life, and once we accept that multi-task userspace apps have a very fair expectation of being able to use locks, we've got to think about how to offer the option of a deterministic locking implementation to user-space. Most of the technical counter-arguments against doing priority inheritance only apply to kernel-space locks. But user-space locks are different, there we cannot disable interrupts or make the task non-preemptible in a critical section, so the 'use spinlocks' argument does not apply (user-space spinlocks have the same priority inversion problems as other user-space locking constructs). Fact is, pretty much the only technique that currently enables good determinism for userspace locks (such as futex-based pthread mutexes) is priority inheritance: Currently (without PI), if a high-prio and a low-prio task shares a lock [this is a quite common scenario for most non-trivial RT applications], even if all critical sections are coded carefully to be deterministic (i.e. all critical sections are short in duration and only execute a limited number of instructions), the kernel cannot guarantee any deterministic execution of the high-prio task: any medium-priority task could preempt the low-prio task while it holds the shared lock and executes the critical section, and could delay it indefinitely. Implementation: --------------- As mentioned before, the userspace fastpath of PI-enabled pthread mutexes involves no kernel work at all - they behave quite similarly to normal futex-based locks: a 0 value means unlocked, and a value==TID means locked. (This is the same method as used by list-based robust futexes.) Userspace uses atomic ops to lock/unlock these mutexes without entering the kernel. To handle the slowpath, we have added two new futex ops: FUTEX_LOCK_PI FUTEX_UNLOCK_PI If the lock-acquire fastpath fails, [i.e. an atomic transition from 0 to TID fails], then FUTEX_LOCK_PI is called. The kernel does all the remaining work: if there is no futex-queue attached to the futex address yet then the code looks up the task that owns the futex [it has put its own TID into the futex value], and attaches a 'PI state' structure to the futex-queue. The pi_state includes an rt-mutex, which is a PI-aware, kernel-based synchronization object. The 'other' task is made the owner of the rt-mutex, and the FUTEX_WAITERS bit is atomically set in the futex value. Then this task tries to lock the rt-mutex, on which it blocks. Once it returns, it has the mutex acquired, and it sets the futex value to its own TID and returns. Userspace has no other work to perform - it now owns the lock, and futex value contains FUTEX_WAITERS|TID. If the unlock side fastpath succeeds, [i.e. userspace manages to do a TID -> 0 atomic transition of the futex value], then no kernel work is triggered. If the unlock fastpath fails (because the FUTEX_WAITERS bit is set), then FUTEX_UNLOCK_PI is called, and the kernel unlocks the futex on the behalf of userspace - and it also unlocks the attached pi_state->rt_mutex and thus wakes up any potential waiters. Note that under this approach, contrary to other PI-futex approaches, there is no prior 'registration' of a PI-futex. [which is not quite possible anyway, due to existing ABI properties of pthread mutexes.] Also, under this scheme, 'robustness' and 'PI' are two orthogonal properties of futexes, and all four combinations are possible: futex, robust-futex, PI-futex, robust+PI-futex. glibc support: -------------- Ulrich Drepper and Jakub Jelinek have written glibc support for PI-futexes (and robust futexes), enabling robust and PI (PTHREAD_PRIO_INHERIT) POSIX mutexes. (PTHREAD_PRIO_PROTECT support will be added later on too, no additional kernel changes are needed for that). [NOTE: The glibc patch is obviously inofficial and unsupported without matching upstream kernel functionality.] the patch-queue and the glibc patch can also be downloaded from: http://redhat.com/~mingo/PI-futex-patches/ Many thanks go to the people who helped us create this kernel feature: Steven Rostedt, Esben Nielsen, Benedikt Spranger, Daniel Walker, John Cooper, Arjan van de Ven, Oleg Nesterov and others. Credits for related prior projects goes to Dirk Grambow, Inaky Perez-Gonzalez, Bill Huey and many others. Clean up the futex code, before adding more features to it: - use u32 as the futex field type - that's the ABI - use __user and pointers to u32 instead of unsigned long - code style / comment style cleanups - rename hash-bucket name from 'bh' to 'hb'. I checked the pre and post futex.o object files to make sure this patch has no code effects. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] BUG() if setscheduler is called from interrupt contextSteven Rostedt
Thomas Gleixner is adding the call to a rtmutex function in setscheduler. This call grabs a spin_lock that is not always protected by interrupts disabled. So this means that setscheduler cant be called from interrupt context. To prevent this from happening in the future, this patch adds a BUG_ON(in_interrupt()) in that function. (Thanks to akpm <aka. Andrew Morton> for this suggestion). Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] sched: uninline task_rq_lock()Oleg Nesterov
Saves 543 bytes from sched.o (gcc 3.3.3). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Con Kolivas <kernel@kolivas.org> Cc: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] sched: mc/smt power savings sched policySiddha, Suresh B
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in /sys/devices/system/cpu/ control the MC/SMT power savings policy for the scheduler. Based on the values (1-enable, 0-disable) for these controls, sched groups cpu power will be determined for different domains. When power savings policy is enabled and under light load conditions, scheduler will minimize the physical packages/cpu cores carrying the load and thus conserving power(with a perf impact based on the workload characteristics... see OLS 2005 CMP kernel scheduler paper for more details..) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Con Kolivas <kernel@kolivas.org> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] sched_domai: Allocate sched_group structures dynamicallySrivatsa Vaddagiri
As explained here: http://marc.theaimsgroup.com/?l=linux-kernel&m=114327539012323&w=2 there is a problem with sharing sched_group structures between two separate sched_group structures for different sched_domains. The patch has been tested and found to avoid the kernel lockup problem described in above URL. Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] sched_domai: Use kmalloc_nodeSrivatsa Vaddagiri
The sched group structures used to represent various nodes need to be allocated from respective nodes (as suggested here also: http://uwsg.ucs.indiana.edu/hypermail/linux/kernel/0603.3/0051.html) Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] sched_domai: Don't use GFP_ATOMICSrivatsa Vaddagiri
Replace GFP_ATOMIC allocation for sched_group_nodes with GFP_KERNEL based allocation. Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] sched_domain: handle kmalloc failureSrivatsa Vaddagiri
Try to handle mem allocation failures in build_sched_domains by bailing out and cleaning up thus-far allocated memory. The patch has a direct consequence that we disable load balancing completely (even at sibling level) upon *any* memory allocation failure. [Lee.Schermerhorn@hp.com: bugfix] Signed-off-by: Srivatsa Vaddagir <vatsa@in.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] sched: Avoid unnecessarily moving highest priority task move_tasks()Peter Williams
Problem: To help distribute high priority tasks evenly across the available CPUs move_tasks() does not, under some circumstances, skip tasks whose load weight is bigger than the designated amount. Because the highest priority task on the busiest queue may be on the expired array it may be moved as a result of this mechanism. Apart from not being the most desirable way to redistribute the high priority tasks (we'd rather move the second highest priority task), there is a risk that this could set up a loop with this task bouncing backwards and forwards between the two queues. (This latter possibility can be demonstrated by running a nice==-20 CPU bound task on an otherwise quiet 2 CPU system.) Solution: Modify the mechanism so that it does not override skip for the highest priority task on the CPU. Of course, if there are more than one tasks at the highest priority then it will allow the override for one of them as this is a desirable redistribution of high priority tasks. Signed-off-by: Peter Williams <pwil3058@bigpond.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>