Age Commit message Author
2009-09-10sched: Disable NEW_FAIR_SLEEPERS for nowIngo Molnar
Nikos Chantziaras and Jens Axboe reported that turning off NEW_FAIR_SLEEPERS improves desktop interactivity visibly.

Nikos described his experiences the following way:

  " With this setting, I can do "nice -n 19 make -j20" and still have a very smooth desktop and watch a movie at the same time. Various other annoyances (like the "logout/shutdown/restart" dialog of KDE not appearing at all until the background fade-out effect has finished) are also gone. So this seems to be the single most important setting that vastly improves desktop behavior, at least here. "

Jens described it the following way, referring to a 10-seconds xmodmap scheduling delay he was trying to debug:

  " Then I tried switching NO_NEW_FAIR_SLEEPERS on, and then I get:

    Performance counter stats for 'xmodmap .xmodmap-carl':

          9.009137  task-clock-msecs  #  0.447 CPUs
                18  context-switches  #  0.002 M/sec
                 1  CPU-migrations    #  0.000 M/sec
               315  page-faults       #  0.035 M/sec

       0.020167093  seconds time elapsed

    Woot! "

So disable it for now. In perf trace output I can see weird delta timestamps:

  cc1-9943 [001] 2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]

That nsec field is not supposed to be that large. More digging is needed - but let's turn it off while the real bug is found.

Reported-by: Nikos Chantziaras <realnc@arcor.de>
Tested-by: Nikos Chantziaras <realnc@arcor.de>
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <4AA93D34.8040500@arcor.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-09sched: Keep kthreads at default priorityMike Galbraith
Removes kthread/workqueue priority boost, they increase worst-case desktop latencies. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1252486344.28645.18.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-09sched: Re-tune the scheduler latency defaults to decrease worst-case latenciesMike Galbraith
Reduce the latency target from 20 msecs to 5 msecs. Why? Larger latencies increase spread, which is good for scaling, but bad for worst case latency. We still have the ilog(nr_cpus) rule to scale up on bigger server boxes. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1252486344.28645.18.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
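The scaling rule mentioned above can be sketched roughly as follows. This is a minimal Python illustration, not the kernel code: it assumes the scale factor is 1 + floor(log2(nr_cpus)) applied to the base latency target, and the function name is hypothetical. Any upper limit the kernel may apply is deliberately left out.

```python
def scaled_latency_ns(base_latency_ns: int, nr_cpus: int) -> int:
    """Scale a base latency target by 1 + floor(log2(nr_cpus)).

    The exact factor is an assumption of this sketch; the commit only
    refers to an "ilog(nr_cpus) rule".
    """
    if nr_cpus < 1:
        raise ValueError("nr_cpus must be at least 1")
    ilog2 = nr_cpus.bit_length() - 1  # integer floor(log2(nr_cpus))
    return base_latency_ns * (1 + ilog2)


OLD_TARGET_NS = 20_000_000  # 20 msecs, the previous default named in the commit
NEW_TARGET_NS = 5_000_000   # 5 msecs, the new default named in the commit
```

Under these assumptions a single-CPU machine keeps the 5 msec target unchanged, while a 16-CPU box scales it by a factor of 5.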
2009-09-09sched: Turn off child_runs_firstMike Galbraith
Set child_runs_first default to off. It hurts 'optimal' make -j<NR_CPUS> workloads as make jobs get preempted by child tasks, reducing parallelism. Note, this patch might make existing races in user applications more prominent than before - so breakages might be bisected to this commit. Child-runs-first is broken on SMP to begin with, and we already had it off briefly in v2.6.23 so most of the offenders ought to be fixed. Would be nice not to revert this commit but fix those apps finally ... Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1252486344.28645.18.camel@marge.simson.net> [ made the sysctl independent of CONFIG_SCHED_DEBUG, in case people want to work around broken apps. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-08sched: Ensure that a child can't gain time over it's parent after fork()Mike Galbraith
A fork/exec load is usually "pass the baton", so the child should never be placed behind the parent. With START_DEBIT we make room for the new task, but with child_runs_first, that room comes out of the _parent's_ hide. There's nothing to say that the parent wasn't ahead of min_vruntime at fork() time, which means that the "baton carrier", who is essentially the parent in drag, can gain time and increase scheduling latencies for waiters. With NEW_FAIR_SLEEPERS + START_DEBIT + child_runs_first enabled, we essentially pass the sleeper fairness off to the child, which is fine, but if we don't base placement on the parent's updated vruntime, we can end up compounding latency woes if the child itself then does fork/exec. The debit incurred at fork doesn't hurt the parent who is then going to sleep and maybe exit, but the child who acquires the error harms all comers. This improves latencies of make -j<n> kernel build workloads. Reported-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-07sched: enable SD_WAKE_IDLEPeter Zijlstra
Now that SD_WAKE_IDLE doesn't make pipe-test suck anymore, enable it by default for MC, CPU and NUMA domains. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-07sched: Deal with low-load in wake_affine()Peter Zijlstra
wake_affine() would always fail under low-load situations where both prev and this were idle, because adding a single task will always be a significant imbalance, even if there's nothing around that could balance it. Deal with this by allowing imbalance when there's nothing you can do about it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-07sched: Remove short cut from select_task_rq_fair()Peter Zijlstra
select_task_rq_fair() incorrectly skips the wake_affine() logic, remove this. When prev_cpu == this_cpu, the code jumps straight to the wake_idle() logic, this doesn't give the wake_affine() logic the chance to pin the task to this cpu. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Turn on SD_BALANCE_NEWIDLEIngo Molnar
Start the re-tuning of the balancer by turning on newidle. It improves hackbench performance and parallelism on a 4x4 box. The "perf stat --repeat 10" measurements give us: domain0 domain1 ....................................... -SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2041.273208 task-clock-msecs # 9.354 CPUs ( +- 0.363% ) +SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2086.326925 task-clock-msecs # 11.934 CPUs ( +- 0.301% ) +SD_BALANCE_NEWIDLE +SD_BALANCE_NEWIDLE: 2115.289791 task-clock-msecs # 12.158 CPUs ( +- 0.263% ) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Clean up topology.hIngo Molnar
Re-organize the flag settings so that it's visible at a glance which sched-domains flags are set and which not. With the new balancer code we'll need to re-tune these details anyway, so make it cleaner to make fewer mistakes down the road ;-) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Fix dynamic power-balancing crashIngo Molnar
This crash:

  [ 1774.088275] divide error: 0000 [#1] SMP
  [ 1774.100355] CPU 13
  [ 1774.102498] Modules linked in:
  [ 1774.105631] Pid: 30881, comm: hackbench Not tainted 2.6.31-rc8-tip-01308-g484d664-dirty #1629 X8DTN
  [ 1774.114807] RIP: 0010:[<ffffffff81041c38>] [<ffffffff81041c38>] sched_balance_self+0x19b/0x2d4

Triggers because update_group_power() modifies the sd tree and does temporary calculations there - not considering that other CPUs could observe intermediate values, such as the zero initial value.

Calculate it in a temporary variable instead. (we need no memory barrier as these are all statistical values anyway)

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090904092742.GA11014@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
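The difference between accumulating in the shared field and accumulating in a temporary can be illustrated with a small model. This is a Python sketch only: the `Group` class and both update functions are hypothetical stand-ins, and the `observed` list simply records every value that a concurrent reader could have seen.

```python
class Group:
    """Hypothetical stand-in for a sched group whose power other CPUs read."""

    def __init__(self, power):
        self._power = power
        self.observed = [power]  # every value ever visible in the shared field

    @property
    def power(self):
        return self._power

    @power.setter
    def power(self, value):
        self._power = value
        self.observed.append(value)


def update_in_place(group, child_powers):
    # Racy pattern: reset the shared field, then accumulate into it.
    # A reader dividing by group.power in between can hit zero.
    group.power = 0
    for p in child_powers:
        group.power = group.power + p


def update_via_temporary(group, child_powers):
    # Pattern described in the commit: accumulate locally and publish
    # the final value with a single store.
    total = 0
    for p in child_powers:
        total += p
    group.power = total
```

With the temporary, the shared field only ever holds the old value or the new one, so a reader never sees the zero intermediate.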
2009-09-04sched: Remove reciprocal for cpu_powerPeter Zijlstra
It's a source of fail; also, now that cpu_power is dynamical, it's a waste of time.

before:
  <idle>-0 [000] 132.877936: find_busiest_group: avg_load: 0 group_load: 8241 power: 1

after:
  bash-1689 [001] 137.862151: find_busiest_group: avg_load: 10636288 group_load: 10387 power: 1

[ v2: build fix from Andreas Herrmann ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
LKML-Reference: <20090901083826.425896304@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Try to deal with low capacity, fix update_sd_power_savings_stats()Gautham R Shenoy
sgs.group_capacity can now be 0, if for some reason group->__cpu_power happens to be less than SCHED_LOAD_SCALE/2. In that case, we need the following fix to make it work for update_sd_power_savings_stats(). That's because both sum_nr_running and group_capacity are unsigned longs. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
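The unsigned-arithmetic hazard can be shown with a short model. The sketch below is in Python and emulates a 64-bit unsigned long with a mask. The two comparison forms are illustrative assumptions, since the commit message does not show the expression that was changed.

```python
ULONG_MASK = (1 << 64) - 1  # assumes a 64-bit unsigned long


def as_ulong(value: int) -> int:
    """Wrap a Python int the way an unsigned long would wrap."""
    return value & ULONG_MASK


def over_capacity_wrapping(sum_nr_running: int, group_capacity: int) -> bool:
    # Hazardous form: with group_capacity == 0 the subtraction wraps to
    # 2**64 - 1, so the comparison can never be true.
    return as_ulong(sum_nr_running) > as_ulong(group_capacity - 1)


def over_capacity_safe(sum_nr_running: int, group_capacity: int) -> bool:
    # Equivalent test for capacity >= 1 that avoids subtracting from zero.
    return as_ulong(sum_nr_running + 1) > as_ulong(group_capacity)
```

For any capacity of at least 1 the two forms agree; they differ only in the zero-capacity case that the commit describes.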
2009-09-04sched: Try to deal with low capacityPeter Zijlstra
When the capacity drops low, we want to migrate load away. Allow the load-balancer to remove all tasks when we hit rock bottom. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.342231003@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Scale down cpu_power due to RT tasksPeter Zijlstra
Keep an average of the amount of time spent on RT tasks and use that fraction to scale down the cpu_power for regular tasks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.287778431@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
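The scaling described here amounts to multiplying the power by the fraction of time not consumed by RT tasks. The following is a minimal Python sketch under stated assumptions: the function name, the averaging period parameter, and the floor of 1 (to keep later divisions safe) are all hypothetical and not taken from the patch.

```python
SCHED_LOAD_SCALE = 1024  # assumed nominal power of one fully available CPU


def scale_power_for_rt(power: int, rt_time_ns: int, period_ns: int) -> int:
    """Scale power by the share of the period left over after RT tasks."""
    if period_ns <= 0:
        raise ValueError("period_ns must be positive")
    # Clamp the RT time into [0, period] so the fraction stays in [0, 1].
    rt_time_ns = min(max(rt_time_ns, 0), period_ns)
    available_ns = period_ns - rt_time_ns
    # Never return 0: a zero power would break consumers that divide by it.
    return max(1, power * available_ns // period_ns)
```

For example, a CPU that spent a quarter of the averaging period on RT tasks would be left with three quarters of its nominal power for regular tasks.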
2009-09-04sched: Implement dynamic cpu_powerPeter Zijlstra
Recompute the cpu_power for each cpu during load-balance. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.162033479@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Add smt_gainPeter Zijlstra
The idea is that multi-threading a core yields more work capacity than a single thread, provide a way to express a static gain for threads. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.073345955@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
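One way to picture a static gain is to give the whole core slightly more than one CPU's worth of power and divide it among the hardware threads. The Python sketch below assumes exactly that; the helper name and the example gain of 1178 (about 15% above a nominal 1024) are assumptions for illustration, not values stated in this commit message.

```python
SCHED_LOAD_SCALE = 1024  # assumed nominal power of a single-threaded core


def per_thread_power(smt_gain: int, nr_siblings: int) -> int:
    """Split a core's total (gained) power evenly across its threads."""
    if nr_siblings < 1:
        raise ValueError("nr_siblings must be at least 1")
    return smt_gain // nr_siblings


EXAMPLE_SMT_GAIN = 1178  # hypothetical: roughly 15% more than SCHED_LOAD_SCALE
```

With two siblings, each thread is rated below a full CPU, while the core as a whole is rated above one, which matches the idea that multi-threading yields more capacity than a single thread.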
2009-09-04sched: Update the cpu_power sum during load-balancePeter Zijlstra
In order to prepare for a more dynamic cpu_power, update the group sum while walking the sched domains during load-balance. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.985050292@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Add SD_PREFER_SIBLINGPeter Zijlstra
Do the placement thing using SD flags. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.897028974@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04sched: Restore __cpu_power to a straight sum of powerPeter Zijlstra
cpu_power is supposed to be a representation of the process capacity of the cpu, not a value to randomly tweak in order to affect placement. Remove the placement hacks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.810860576@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04Merge branches 'sched/domains' and 'sched/clock' into sched/coreIngo Molnar
Merge reason: both topics are ready now, and we want to merge dependent changes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02sched: Add wait, sleep and iowait accounting tracepointsPeter Zijlstra
Add 3 schedstat tracepoints to help account for wait-time, sleep-time and iowait-time. They can also be used as a perf-counter source to profile tasks on these clocks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <new-submission> [ build fix for the !CONFIG_SCHEDSTATS case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02sched: Provide iowait countersArjan van de Ven
For counting how long an application has been waiting for (disk) IO, there currently is only the HZ sample driven information available, while for all other counters in this class, a high resolution version is available via CONFIG_SCHEDSTATS. In order to make an improved bootchart tool possible, we also need a higher resolution version of the iowait time. This patch below adds this scheduler statistic to the kernel. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4A64B813.1080506@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02Merge commit 'v2.6.31-rc8' into sched/coreIngo Molnar
Merge reason: bump from rc5 to rc8, but also pick up TP_perf_assign() API, a patch will be queued that depends on it. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-29sched: Rename init_cfs_rq => init_tg_cfs_rqAnirban Sinha
... so that it does not share a common name with a function within the same scope. Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com> LKML-Reference: <DDFD17CC94A9BD49A82147DDF7D545C501EA98A6@exchange.ZeugmaSystems.local> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28sched: Fix division by zero - reallyPeter Zijlstra
When re-computing the shares for each task group's cpu representation we need the ratio of weight on each cpu vs the total weight of the sched domain. Since load-balancing is loosely (read not) synchronized, the weight of individual cpus can change between doing the sum and calculating the ratio. The previous patch dealt with only one of the race scenarios, this patch side steps them all by saving a snapshot of all the individual cpu weights, thereby always working on a consistent set. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: torvalds@linux-foundation.org Cc: jes@sgi.com Cc: jens.axboe@oracle.com Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1251371336.18584.77.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
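The snapshot approach can be sketched as follows. This is a Python illustration with hypothetical names: `read_weight` stands in for reading a per-cpu weight that may change at any time, and the even split used when the total is zero is an assumption of the sketch.

```python
def compute_shares(read_weight, nr_cpus: int, total_shares: int) -> list:
    """Distribute total_shares in proportion to a consistent weight snapshot."""
    # Read every per-cpu weight exactly once; all later arithmetic uses
    # this snapshot, so the sum and the ratios always agree.
    snapshot = [read_weight(cpu) for cpu in range(nr_cpus)]
    total = sum(snapshot)
    if total == 0:
        # Nothing is weighted anywhere: split evenly (assumption of this
        # sketch) rather than dividing by zero.
        return [total_shares // nr_cpus] * nr_cpus
    return [total_shares * weight // total for weight in snapshot]
```

Because each weight is read once, a weight that changes between the summation and the ratio calculation can no longer produce an inconsistent or zero divisor.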
2009-08-27Linux 2.6.31-rc8Linus Torvalds
2009-08-27module: workaround duplicate section namesJames Bottomley
The root cause is a duplicate section name (.text); is this legal? [ Amerigo Wang: "AFAIK, yes." ] However, there's a problem with commit 6d76013381ed28979cd122eb4b249a88b5e384fa in that if you fail to allocate a mod->sect_attrs (in this case it's null because of the duplication), it still gets used without checking in add_notes_attrs(). This should fix it. [ This patch leaves other problems, particularly the sections directory, but recent parisc toolchains seem to produce these modules and this prevents a crash and is a minimal change -- RR ] Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27module: fix BUG_ON() for powerpc (and other function descriptor archs)Rusty Russell
The rarely-used symbol_put_addr() needs to use dereference_function_descriptor on powerpc. Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27xenfb: connect to backend before registering fbJeremy Fitzhardinge
As soon as the framebuffer is registered, our methods may be called by the kernel. This leads to a crash as xenfb_refresh() gets called before we have the irq. Connect to the backend before registering our framebuffer with the kernel. [ Fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=14059 ] Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds
* 'for-linus' of git://git.infradead.org/users/eparis/notify: inotify: Ensure we alwasy write the terminating NULL. inotify: fix locking around inotify watching in the idr inotify: do not BUG on idr entries at inotify destruction inotify: seperate new watch creation updating existing watches
2009-08-27lmb: Remove __init from lmb_end_of_DRAM()Benjamin Herrenschmidt
We call lmb_end_of_DRAM() to test whether a DMA mask is ok on a machine without IOMMU, but this function is marked as __init. I don't think there's a clean way to get the top of RAM: max_pfn doesn't appear to include highmem, or I missed something (or we have a bug :-). So for now, let's just avoid having a broken 2.6.31 by making this function non-__init and we can revisit later. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fsLinus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: update documentation pointers
  9p: remove unnecessary v9fses->options which duplicates the mount string
  net/9p: insulate the client against an invalid error code sent by a 9p server
  9p: Add missing cast for the error return value in v9fs_get_inode
  9p: Remove redundant inode uid/gid assignment
  9p: Fix possible regressions when ->get_sb fails.
  9p: Fix v9fs show_options
  9p: Fix possible memleak in v9fs_inode_from fid.
  9p: minor comment fixes
  9p: Fix possible inode leak in v9fs_get_inode.
  9p: Check for error in return value of v9fs_fid_add
2009-08-27ipv4: make ip_append_data() handle NULL routing tableJulien TINNES
Add a check in ip_append_data() for NULL *rtp to prevent future bugs in callers from being exploitable. Signed-off-by: Julien Tinnes <julien@cr0.org> Signed-off-by: Tavis Ormandy <taviso@sdf.lonestar.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27AFS: Stop readlink() on AFS crashing due to NULL 'file' ptrDavid Howells
kAFS crashes when asked to read a symbolic link because page_getlink() passes a NULL file pointer to read_mapping_page(), but afs_readpage() expects a file pointer from which to extract a key. Modify afs_readpage() to request the appropriate key from the calling process's keyrings if a file struct is not supplied with one attached. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27init: Move sched_clock_init after late_time_initThomas Gleixner
Some architectures initialize clocks and timers in late_time_init and x86 wants to do the same to avoid FIXMAP hackery for calibrating the TSC. That would result in undefined sched_clock readout and wreckaged printk timestamps again. We probably have those already on archs which do all their time/clock setup in late_time_init. There is no harm to move that after late_time_init except that a few more boot timestamps are stale. The scheduler is not active at that point so no real wreckage is expected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: linux-arch@vger.kernel.org
2009-08-27inotify: Ensure we alwasy write the terminating NULL.Eric W. Biederman
Before the rewrite copy_event_to_user always wrote a terminating '\0' byte to user space after the filename. Since the rewrite that terminating byte was skipped if your filename is exactly a multiple of event_size. Ouch! So add one byte to name_size before we round up and use clear_user to set userspace to zero like /dev/zero does instead of copying the strange nul_inotify_event. I can't quite convince myself len_to_zero will never exceed 16 and even if it doesn't clear_user should be more efficient and a more accurate reflection of what the code is trying to do. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: Eric Paris <eparis@redhat.com>
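The off-by-one in the padding calculation is easy to see with concrete numbers. The Python sketch below uses hypothetical helper names and assumes an event_size of 16 bytes for the example; it only models the size arithmetic, not the copy to user space.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple


def padded_name_size_old(name_len: int, event_size: int) -> int:
    # Rounds the bare name length: a name that is already an exact
    # multiple of event_size gets no padding, so no room for the '\0'.
    return round_up(name_len, event_size)


def padded_name_size_fixed(name_len: int, event_size: int) -> int:
    # Adds one byte for the terminator before rounding, as the commit
    # describes, so there is always at least one byte after the name.
    return round_up(name_len + 1, event_size)
```

With a 16-byte name and a 16-byte event size, the old calculation yields 16 (no space for the terminator), while the fixed one yields 32.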
2009-08-27inotify: fix locking around inotify watching in the idrEric Paris
There are races around the idr storage of inotify watches. It's possible that a watch could be found from sys_inotify_rm_watch() in the idr, but it could be removed from the idr before that code does its removal. Move the locking and the refcnt'ing so that these have to happen atomically. Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-27inotify: do not BUG on idr entries at inotify destructionEric Paris
If an inotify watch is left in the idr when an fsnotify group is destroyed this will lead to a BUG. This is not a dangerous situation and really indicates a programming bug and leak of memory. This patch changes it to use a WARN and a printk rather than killing people's boxes. Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-27inotify: seperate new watch creation updating existing watchesEric Paris
There is nothing known to be wrong with the inotify watch addition/modification, but this patch separates the two code paths to make them each easy to verify as correct. Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: virtio: net refill on out-of-memory smc91x: fix compilation on SMP
2009-08-26Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpcLinus Torvalds
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/ps3: Update ps3_defconfig
  powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration
2009-08-27powerpc/ps3: Update ps3_defconfigGeoff Levand
Update ps3_defconfig. o Refresh for 2.6.31. o Remove MTD support. o Add more HID drivers. Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-27powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registrationGeert Uytterhoeven
On non-PS3, we get: | kernel BUG at drivers/rtc/rtc-ps3.c:36! because the rtc-ps3 platform device is registered unconditionally in a kernel with builtin support for PS3. Reported-by: Sachin Sant <sachinp@in.ibm.com> Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-26Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  IMA: iint put in ima_counts_get and put
2009-08-26Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68kLinus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k,m68knommu: Wire up rt_tgsigqueueinfo and perf_counter_open
  m68k: Fix redefinition of pgprot_noncached
  arch/m68k/include/asm/motorola_pgalloc.h: fix kunmap arg
  m68k: cnt reaches -1, not 0
  m68k: count can reach 51, not 50
2009-08-26leds: after setting inverted attribute, we must update the LEDThadeu Lima de Souza Cascardo
If we change the inverted attribute to another value, the LED will not be inverted until we change the GPIO state. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Cc: Samuel R. C. Vale <srcvale@holoscopio.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26leds: fix multiple requests and releases of IRQ for GPIO LED TriggerThadeu Lima de Souza Cascardo
When setting the same GPIO number, multiple IRQ shared requests will be done without freeing the previous request. It will also try to free a failed request or an already freed IRQ if 0 was written to the gpio file. All these oopses and leaks were fixed with the following solution: keep the previously allocated GPIO (if any) still allocated in case the new request fails. The alternative solution would be to deallocate the previously allocated GPIO and set gpio to 0. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Samuel R. C. Vale <srcvale@holoscopio.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26acpi processor: remove superfluous warning messageFrans Pop
This failure is very common on many platforms. Handling it in the ACPI processor driver is enough, and we don't need a warning message unless CONFIG_ACPI_DEBUG is set. Based on a patch from Zhang Rui. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389 Signed-off-by: Frans Pop <elendil@planet.nl> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26ACPI processor: force throttling state when BIOS returns incorrect valueFrans Pop
If the BIOS reports an invalid throttling state (which seems to be fairly common after system boot), a reset is done to state T0. Because of a check in acpi_processor_get_throttling_ptc(), the reset never actually gets executed, which results in the error reoccurring on every access of for example /proc/acpi/processor/CPU0/throttling. Add a 'force' option to acpi_processor_set_throttling() to ensure the reset really takes effect. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389 This patch, together with the next one, fixes a regression introduced in 2.6.30, listed on the regression list. They have been available for 2.5 months now in bugzilla, but have not been picked up, despite various reminders and without any reason given. Google shows that numerous people are hitting this issue. The issue is in itself relatively minor, but the bug in the code is clear. The patches have been in all my kernels and today testing has shown that throttling works correctly with the patches applied when the system overheats (http://bugzilla.kernel.org/show_bug.cgi?id=13918#c14). Signed-off-by: Frans Pop <elendil@planet.nl> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>