aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2008-10-24sched: weaken sync hintMike Galbraith
Mysql+oltp and pgsql+oltp peaks are still shifted right. The below puts the peaks back to 1 client/server pair per core. Use the avg_overlap information to weaken the sync hint. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24sched: more accurate min_vruntime accountingPeter Zijlstra
Mike noticed the current min_vruntime tracking can go wrong and skip the current task. If the only remaining task in the tree is a nice 19 task with huge vruntime, new tasks will be inserted too far to the right too, causing some interactibity issues. min_vruntime can only change due to the leftmost entry disappearing (dequeue_entity()), or by the leftmost entry being incremented past the next entry, which elects a new leftmost (__update_curr()) Due to the current entry not being part of the actual tree, we have to compare the leftmost tree entry with the current entry, and take the leftmost of these two. So create a update_min_vruntime() function that takes computes the leftmost vruntime in the system (either tree of current) and increases the cfs_rq->min_vruntime if the computed value is larger than the previously found min_vruntime. And call this from the two sites we've identified that can change min_vruntime. Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24sched: fix a find_busiest_group bugletPeter Zijlstra
In one of the group load balancer patches: commit 408ed066b11cf9ee4536573b4269ee3613bd735e Author: Peter Zijlstra <a.p.zijlstra@chello.nl> Date: Fri Jun 27 13:41:28 2008 +0200 Subject: sched: hierarchical load vs find_busiest_group The following change: - if (max_load - this_load + SCHED_LOAD_SCALE_FUZZ >= + if (max_load - this_load + 2*busiest_load_per_task >= busiest_load_per_task * imbn) { made the condition always true, because imbn is [1,2]. Therefore, remove the 2*, and give the it a fair chance. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-24Merge commit 'v2.6.28-rc1' into sched/urgentIngo Molnar
2008-10-23Fix compile warning in kernel/params.cLinus Torvalds
Move free_module_param_attrs() into the CONFIG_MODULES section, since it's only used inside there. Thus avoiding the warning kernel/params.c:514: warning: 'free_module_param_attrs' defined but not used Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23Merge branch 'proc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc * 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits) proc: remove fs/proc/proc_misc.c proc: move /proc/vmcore creation to fs/proc/vmcore.c proc: move pagecount stuff to fs/proc/page.c proc: move all /proc/kcore stuff to fs/proc/kcore.c proc: move /proc/schedstat boilerplate to kernel/sched_stats.h proc: move /proc/modules boilerplate to kernel/module.c proc: move /proc/diskstats boilerplate to block/genhd.c proc: move /proc/zoneinfo boilerplate to mm/vmstat.c proc: move /proc/vmstat boilerplate to mm/vmstat.c proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c proc: move /proc/buddyinfo boilerplate to mm/vmstat.c proc: move /proc/vmallocinfo to mm/vmalloc.c proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c proc: move /proc/slab_allocators boilerplate to mm/slab.c proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c proc: move /proc/stat to fs/proc/stat.c proc: move rest of /proc/partitions code to block/genhd.c proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c proc: move /proc/devices code to fs/proc/devices.c proc: move rest of /proc/locks to fs/locks.c ...
2008-10-23Merge branch 'v28-range-hrtimers-for-linus-v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits) hrtimers: add missing docbook comments to struct hrtimer hrtimers: simplify hrtimer_peek_ahead_timers() hrtimers: fix docbook comments DECLARE_PER_CPU needs linux/percpu.h hrtimers: fix typo rangetimers: fix the bug reported by Ingo for real rangetimer: fix BUG_ON reported by Ingo rangetimer: fix x86 build failure for the !HRTIMERS case select: fix alpha OSF wrapper select: fix alpha OSF wrapper hrtimer: peek at the timer queue just before going idle hrtimer: make the futex() system call use the per process slack value hrtimer: make the nanosleep() syscall use the per process slack hrtimer: fix signed/unsigned bug in slack estimator hrtimer: show the timer ranges in /proc/timer_list hrtimer: incorporate feedback from Peter Zijlstra hrtimer: add a hrtimer_start_range() function hrtimer: another build fix hrtimer: fix build bug found by Ingo hrtimer: make select() and poll() use the hrtimer range feature ...
2008-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdevLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits) [PATCH] kill the rest of struct file propagation in block ioctls [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET [PATCH] get rid of blkdev_locked_ioctl() [PATCH] get rid of blkdev_driver_ioctl() [PATCH] sanitize blkdev_get() and friends [PATCH] remember mode of reiserfs journal [PATCH] propagate mode through swsusp_close() [PATCH] propagate mode through open_bdev_excl/close_bdev_excl [PATCH] pass fmode_t to blkdev_put() [PATCH] kill the unused bsize on the send side of /dev/loop [PATCH] trim file propagation in block/compat_ioctl.c [PATCH] end of methods switch: remove the old ones [PATCH] switch sr [PATCH] switch sd [PATCH] switch ide-scsi [PATCH] switch tape_block [PATCH] switch dcssblk [PATCH] switch dasd [PATCH] switch mtd_blkdevs [PATCH] switch mmc ...
2008-10-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (46 commits) [PATCH] fs: add a sanity check in d_free [PATCH] i_version: remount support [patch] vfs: make security_inode_setattr() calling consistent [patch 1/3] FS_MBCACHE: don't needlessly make it built-in [PATCH] move executable checking into ->permission() [PATCH] fs/dcache.c: update comment of d_validate() [RFC PATCH] touch_mnt_namespace when the mount flags change [PATCH] reiserfs: add missing llseek method [PATCH] fix ->llseek for more directories [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup [PATCH vfs-2.6 4/6] vfs: remove unnecessary fsnotify_d_instantiate() [PATCH vfs-2.6 3/6] vfs: add __d_instantiate() helper [PATCH vfs-2.6 2/6] vfs: add d_ancestor() [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT() [PATCH] get rid of on-stack dentry in udf [PATCH 2/2] anondev: switch to IDA [PATCH 1/2] anondev: init IDR statically [JFFS2] Use d_splice_alias() not d_add() in jffs2_lookup() [PATCH] Optimise NFS readdir hack slightly. ...
2008-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: stop_machine: fix error code handling on multiple cpus stop_machine: use workqueues instead of kernel threads workqueue: introduce create_rt_workqueue Call init_workqueues before pre smp initcalls. Make panic= and panic_on_oops into core_params Make initcall_debug a core_param core_param() for genuinely core kernel parameters param: Fix duplicate module prefixes module: check kernel param length at compile time, not runtime Remove stop_machine during module load v2 module: simplify load_module.
2008-10-23Merge branch 'v28-timers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'v28-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: NOHZ: fix thinko in the timer restart code path
2008-10-23Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcupdate: fix bug of rcu_barrier*() profiling: fix !procfs build Fixed trivial conflicts in 'include/linux/profile.h'
2008-10-23Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: disable the hrtick for now sched: revert back to per-rq vruntime sched: fair scheduler should not resched rt tasks sched: optimize group load balancer sched: minor fast-path overhead reduction sched: fix the wrong mask_len, cleanup sched: kill unused scheduler decl. sched: fix the wrong mask_len sched: only update rq->clock while holding rq->lock
2008-10-23Merge branch 'irq-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: NULL struct irq_desc's member 'name' in dynamic_irq_cleanup() genirq: fix off by one and coding style genirq: fix set_irq_type() when recording trigger type
2008-10-23proc: move /proc/schedstat boilerplate to kernel/sched_stats.hAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23proc: move /proc/modules boilerplate to kernel/module.cAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23proc: move /proc/execdomains to kernel/exec_domain.cAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23[PATCH] get rid of nameidata in audit_treeAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-22sched: add CONFIG_SMP consistencyLi Zefan
a patch from Henrik Austad did this: >> Do not declare select_task_rq as part of sched_class when CONFIG_SMP is >> not set. Peter observed: > While a proper cleanup, could you do it by re-arranging the methods so > as to not create an additional ifdef? Do not declare select_task_rq and some other methods as part of sched_class when CONFIG_SMP is not set. Also gather those methods to avoid CONFIG_SMP mess. Idea-by: Henrik Austad <henrik.austad@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Henrik Austad <henrik@austad.us> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-22Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2Thomas Gleixner
Conflicts: kernel/time/tick-sched.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-22stop_machine: fix error code handling on multiple cpusHeiko Carstens
Using |= for updating a value which might be updated on several cpus concurrently will not always work since we need to make sure that the update happens atomically. To fix this just use a write if the called function returns an error code on a cpu. We end up writing the error code of an arbitrary cpu if multiple ones fail but that should be sufficient. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-22stop_machine: use workqueues instead of kernel threadsHeiko Carstens
Convert stop_machine to a workqueue based approach. Instead of using kernel threads for stop_machine we now use a an rt workqueue to synchronize all cpus. This has the advantage that all needed per cpu threads are already created when stop_machine gets called. And therefore a call to stop_machine won't fail anymore. This is needed for s390 which needs a mechanism to synchronize all cpus without allocating any memory. As Rusty pointed out free_module() needs a non-failing stop_machine interface as well. As a side effect the stop_machine code gets simplified. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-22workqueue: introduce create_rt_workqueueHeiko Carstens
create_rt_workqueue will create a real time prioritized workqueue. This is needed for the conversion of stop_machine to a workqueue based implementation. This patch adds yet another parameter to __create_workqueue_key to tell it that we want an rt workqueue. However it looks like we rather should have something like "int type" instead of singlethread, freezable and rt. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu>
2008-10-22Make panic= and panic_on_oops into core_paramsRusty Russell
This allows them to be examined and set after boot, plus means they actually give errors if they are misused (eg. panic=yes). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-22core_param() for genuinely core kernel parametersRusty Russell
There are a lot of one-liner uses of __setup() in the kernel: they're cumbersome and not queryable (definitely not settable) via /sys. Yet it's ugly to simplify them to module_param(), because by default that inserts a prefix of the module name (usually filename). So, introduce a "core_param". The parameter gets no prefix, but appears in /sys/module/kernel/parameters/ (if non-zero perms arg). I thought about using the name "core", but that's more common than "kernel". And if you create a module called "kernel", you will die a horrible death. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-22param: Fix duplicate module prefixesRusty Russell
Instead of insisting each new module_param sysfs entry is unique, handle the case where it already exists (for builtin modules). The current code assumes that all identical prefixes are together in the section: true for normal uses, but not necessarily so if someone overrides MODULE_PARAM_PREFIX. More importantly, it's not true with the new "core_param()" code which uses "kernel" as a prefix. This simplifies the caller for the builtin case, at a slight loss of efficiency (we do the lookup every time to see if the directory exists). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-22module: check kernel param length at compile time, not runtimeRusty Russell
The kparam code tries to handle over-length parameter prefixes at runtime. Not only would I bet this has never been tested, it's not clear that truncating names is a good idea either. So let's check at compile time. We need to move the #define to moduleparam.h to do this, though. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-22Remove stop_machine during module load v2Andi Kleen
Remove stop_machine during module load v2 module loading currently does a stop_machine on each module load to insert the module into the global module lists. Especially on larger systems this can be quite expensive. It does that to handle concurrent lock lessmodule list readers like kallsyms. I don't think stop_machine() is actually needed to insert something into a list though. There are no concurrent writers because the module mutex is taken. And the RCU list functions know how to insert a node into a list with the right memory ordering so that concurrent readers don't go off into the wood. So remove the stop_machine for the module list insert and just do a list_add_rcu() instead. Module removal will still do a stop_machine of course, it needs that for other reasons. v2: Revised readers based on Paul's comments. All readers that only rely on disabled preemption need to be changed to list_for_each_rcu(). Done that. The others are ok because they have the modules mutex. Also added a possible missing preempt disable for print_modules(). [cc Paul McKenney for review. It's not RCU, but quite similar.] Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-22module: simplify load_module.Rusty Russell
Linus' recent catch of stack overflow in load_module lead me to look at the code. A couple of helpers to get a section address and get objects from a section can help clean things up a little. (And in case you're wondering, the stack size also dropped from 328 to 284 bytes). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-10-21NOHZ: fix thinko in the timer restart code pathThomas Gleixner
commit fb02fbc14d17837b4b7b02dbb36142c16a7bf208 (NOHZ: restart tick device from irq_enter()) solves the problem of stale jiffies when long running softirqs happen in a long idle sleep period, but it has a major thinko in it: When the interrupt which came in _is_ the timer interrupt which should expire ts->sched_timer then we cancel and rearm the timer _before_ it gets expired in hrtimer_interrupt() to the next period. That means the call back function is not called. This game can go on for ever :( Prevent this by making sure to only rearm the timer when the expiry time is more than one tick_period away. Otherwise keep it running as it is either already expired or will expiry at the right point to update jiffies. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Venkatesch Pallipadi <venkatesh.pallipadi@intel.com>
2008-10-21rcupdate: fix bug of rcu_barrier*()Lai Jiangshan
current rcu_barrier_bh() is like this: void rcu_barrier_bh(void) { BUG_ON(in_interrupt()); /* Take cpucontrol mutex to protect against CPU hotplug */ mutex_lock(&rcu_barrier_mutex); init_completion(&rcu_barrier_completion); atomic_set(&rcu_barrier_cpu_count, 0); /* * The queueing of callbacks in all CPUs must be atomic with * respect to RCU, otherwise one CPU may queue a callback, * wait for a grace period, decrement barrier count and call * complete(), while other CPUs have not yet queued anything. * So, we need to make sure that grace periods cannot complete * until all the callbacks are queued. */ rcu_read_lock(); on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1); rcu_read_unlock(); wait_for_completion(&rcu_barrier_completion); mutex_unlock(&rcu_barrier_mutex); } The inconsistency of the code and the comments show a bug here. rcu_read_lock() cannot make sure that "grace periods for RCU_BH cannot complete until all the callbacks are queued". it only make sure that race periods for RCU cannot complete until all the callbacks are queued. so we must use rcu_read_lock_bh() for rcu_barrier_bh(). like this: void rcu_barrier_bh(void) { ...... rcu_read_lock_bh(); on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1); rcu_read_unlock_bh(); ...... } and also rcu_barrier() rcu_barrier_sched() are implemented like this. it will bring a lot of duplicate code. My patch uses another way to fix this bug, please see the comment of my patch. Thank Paul E. McKenney for he rewrote the comment. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-21genirq: NULL struct irq_desc's member 'name' in dynamic_irq_cleanup()Dean Nelson
If the member 'name' of the irq_desc structure happens to point to a character string that is resident within a kernel module, problems ensue if that module is rmmod'd (at which time dynamic_irq_cleanup() is called) and then later show_interrupts() is called by someone. It is also not a good thing if the character string resided in kmalloc'd space that has been kfree'd (after having called dynamic_irq_cleanup()). dynamic_irq_cleanup() fails to NULL the 'name' member and show_interrupts() references it on a few architectures (like h8300, sh and x86). Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-21[PATCH] sanitize blkdev_get() and friendsAl Viro
* get rid of fake struct file/struct dentry in __blkdev_get() * merge __blkdev_get() and do_open() * get rid of flags argument of blkdev_get() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21[PATCH] propagate mode through swsusp_close()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21[PATCH] pass fmode_t to blkdev_put()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21genirq: fix set_irq_type() when recording trigger typeChris Friesen
Impact: fix boot hang on a G5 In set_irq_type() we want to pass the type rather than the current interrupt state. Signed-off-by: Chris Friesen <cfriesen@nortel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-20kexec: fix crash_save_vmcoreinfo_init build problemLuck, Tony
This fixes kernel/kexec.c: In function 'crash_save_vmcoreinfo_init': kernel/kexec.c:1374: error: 'vmlist' undeclared (first use in this function) kernel/kexec.c:1374: error: (Each undeclared identifier is reported only once kernel/kexec.c:1374: error: for each function it appears in.) kernel/kexec.c:1410: error: invalid use of undefined type 'struct vm_struct' make[1]: *** [kernel/kexec.o] Error 1 Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20Merge branch 'tracing-v28-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (131 commits) tracing/fastboot: improve help text tracing/stacktrace: improve help text tracing/fastboot: fix initcalls disposition in bootgraph.pl tracing/fastboot: fix bootgraph.pl initcall name regexp tracing/fastboot: fix issues and improve output of bootgraph.pl tracepoints: synchronize unregister static inline tracepoints: tracepoint_synchronize_unregister() ftrace: make ftrace_test_p6nop disassembler-friendly markers: fix synchronize marker unregister static inline tracing/fastboot: add better resolution to initcall debug/tracing trace: add build-time check to avoid overrunning hex buffer ftrace: fix hex output mode of ftrace tracing/fastboot: fix initcalls disposition in bootgraph.pl tracing/fastboot: fix printk format typo in boot tracer ftrace: return an error when setting a nonexistent tracer ftrace: make some tracers reentrant ring-buffer: make reentrant ring-buffer: move page indexes into page headers tracing/fastboot: only trace non-module initcalls ftrace: move pc counter in irqtrace ... Manually fix conflicts: - init/main.c: initcall tracing - kernel/module.c: verbose level vs tracepoints - scripts/bootgraph.pl: fallout from cherry-picking commits.
2008-10-20Merge branch 'genirq-v28-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip This merges branches irq/genirq, irq/sparseirq-v4, timers/hpet-percpu and x86/uv. The sparseirq branch is just preliminary groundwork: no sparse IRQs are actually implemented by this tree anymore - just the new APIs are added while keeping the old way intact as well (the new APIs map 1:1 to irq_desc[]). The 'real' sparse IRQ support will then be a relatively small patch ontop of this - with a v2.6.29 merge target. * 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (178 commits) genirq: improve include files intr_remapping: fix typo io_apic: make irq_mis_count available on 64-bit too genirq: fix name space collisions of nr_irqs in arch/* genirq: fix name space collision of nr_irqs in autoprobe.c genirq: use iterators for irq_desc loops proc: fixup irq iterator genirq: add reverse iterator for irq_desc x86: move ack_bad_irq() to irq.c x86: unify show_interrupts() and proc helpers x86: cleanup show_interrupts genirq: cleanup the sparseirq modifications genirq: remove artifacts from sparseirq removal genirq: revert dynarray genirq: remove irq_to_desc_alloc genirq: remove sparse irq code genirq: use inline function for irq_to_desc genirq: consolidate nr_irqs and for_each_irq_desc() x86: remove sparse irq from Kconfig genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n ...
2008-10-20Merge branch 'v28-timers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'v28-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits) fix documentation of sysrq-q really Fix documentation of sysrq-q timer_list: add base address to clock base timer_list: print cpu number of clockevents device timer_list: print real timer address NOHZ: restart tick device from irq_enter() NOHZ: split tick_nohz_restart_sched_tick() NOHZ: unify the nohz function calls in irq_enter() timers: fix itimer/many thread hang, fix timers: fix itimer/many thread hang, v3 ntp: improve adjtimex frequency rounding timekeeping: fix rounding problem during clock update ntp: let update_persistent_clock() sleep hrtimer: reorder struct hrtimer to save 8 bytes on 64bit builds posix-timers: lock_timer: make it readable posix-timers: lock_timer: kill the bogus ->it_id check posix-timers: kill ->it_sigev_signo and ->it_sigev_value posix-timers: sys_timer_create: cleanup the error handling posix-timers: move the initialization of timer->sigq from send to create path posix-timers: sys_timer_create: simplify and s/tasklist/rcu/ ... Fix trivial conflicts due to sysrq-q description clahes in Documentation/sysrq.txt and drivers/char/sysrq.c
2008-10-20byteorder: remove direct includes of linux/byteorder/swab[b].hHarvey Harrison
A consolidated implementation will provide this generically through asm/byteorder, remove direct includes to avoid breakage when the changeover to the new implementation occurs. This hunk was lost from commit 1d8cca44b6a244b7e378546d719041819049a0f9 ("byteorder: provide swabb.h generically in asm/byteorder.h") Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20byteorder: remove direct includes of linux/byteorder/swab[b].hHarvey Harrison
A consolidated implementation will provide this generically through asm/byteorder, remove direct includes to avoid breakage when the changeover to the new implementation occurs. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20kdump: add vmlist.addr to vmcoreinfo for x86 vmalloc translation.Ken'ichi Ohmichi
Add the symbols 'vmlist' and offset 'vm_struct.addr' to the vmcoreinfo[1] data for i386 vmalloc translation. makedumpfile[2] needs VMALLOC_START value for distinguishing a vmalloc address or not, because it should choose suitable translation method. If applying this patch, makedumpfile will be able to take VMALLOC_START value from 'vmlist.addr'. vmcoreinfo[1]: The vmcoreinfo data has the minimum debugging information only for dump filtering. makedumpfile[2] uses it to distinguish unnecessary pages and creates a small dumpfile. makedumpfile[2]: dump filtering command https://sourceforge.net/projects/makedumpfile/ Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20kthread_bind: use wait_task_inactive(TASK_UNINTERRUPTIBLE)Oleg Nesterov
Now that wait_task_inactive(task, state) checks task->state == state, we can simplify the code and make this debugging check more robust. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20make ptrace_untrace() staticAdrian Bunk
ptrace_untrace() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20cpuset: use seq_*mask_* to print masksLai Jiangshan
1) seq_file excepts that m->count == m->size when it's buf is full, so current code will causes bugs when buf is overflow. 2) There is not too good that cpuset accesses struct seq_file's fields directly. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20cpuset.c: remove extra variableRakib Mullick
Remove the use of int cpus_nonempty variable from 'update_flag' function. Signed-off-by: Md.Rakib H. Mullick <rakib.mullick@gmail.com> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20cgroups: convert tasks file to use a seq_file with shared pid arrayPaul Menage
Rather than pre-generating the entire text for the "tasks" file each time the file is opened, we instead just generate/update the array of process ids and use a seq_file to report these to userspace. All open file handles on the same "tasks" file can share a pid array, which may be updated any time that no thread is actively reading the array. By sharing the array, the potential for userspace to DoS the system by opening many handles on the same "tasks" file is removed. [Based on a patch by Lai Jiangshan, extended to use seq_file] Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20cgroups: fix probable race with put_css_set[_taskexit] and find_css_setLai Jiangshan
put_css_set_taskexit may be called when find_css_set is called on other cpu. And the race will occur: put_css_set_taskexit side find_css_set side | atomic_dec_and_test(&kref->refcount) | /* kref->refcount = 0 */ | .................................................................... | read_lock(&css_set_lock) | find_existing_css_set | get_css_set | read_unlock(&css_set_lock); .................................................................... __release_css_set | .................................................................... | /* use a released css_set */ | [put_css_set is the same. But in the current code, all put_css_set are put into cgroup mutex critical region as the same as find_css_set.] [akpm@linux-foundation.org: repair comments] [menage@google.com: eliminate race in css_set refcounting] Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20kernel/configs.c: remove useless commentsWANG Cong
These comments are useless, remove them. Signed-off-by: WANG Cong <wangcong@zeuux.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>