aboutsummaryrefslogtreecommitdiff
path: root/arch/ia64/kernel/process.c
AgeCommit message (Collapse)Author
2008-10-17Pull pv_ops-xen into release branchTony Luck
2008-10-17ia64: move function declaration, ia64_cpu_local_tick() from .c to .hIsaku Yamahata
eliminate the function declaration ia64_cpu_local_tick() in process.c by defining in arch/ia64/include/asm/timex.h The same function will be used in a different .c file later. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-10-06[IA64] utrace use generic trace hookShaohua Li
Make IA64 use generic trace hook in some paths. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-07-16ACPI : Create "idle=nomwait" bootparamZhao Yakui
"idle=nomwait" disables the use of the MWAIT instruction from both C1 (C1_FFH) and deeper (C2C3_FFH) C-states. When MWAIT is unavailable, the BIOS and OS generally negotiate to use the HALT instruction for C1, and use IO accesses for deeper C-states. This option is useful for power and performance comparisons, and also to work around BIOS bugs where broken MWAIT support is advertised. http://bugzilla.kernel.org/show_bug.cgi?id=10807 http://bugzilla.kernel.org/show_bug.cgi?id=10914 Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Li Shaohua <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16ACPI: Create "idle=halt" bootparamZhao Yakui
"idle=halt" limits the idle loop to using the halt instruction. No MWAIT, no IO accesses, no C-states deeper than C1. If something is broken in the idle code, "idle=halt" is a less severe workaround than "idle=poll" which disables all power savings. Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-06-26smp_call_function: get rid of the unused nonatomic/retry argumentJens Axboe
It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-14[IA64] fix interrupt masking for pending works on kernel leaveHidetoshi Seto
[Bug-fix for "[BUG?][2.6.25-mm1] sleeping during IRQ disabled"] This patch does: - enable interrupts before calling schedule() as same as others, ex. x86 - enable interrupts during ia64_do_signal() and ia64_sync_krbs() - do_notify_resume_user() is still called with interrupts disabled, since we can take short path of fsys_mode if-statement quickly. - pfm_handle_work() is also called with interrupts disabled, since it can deal interrupt mask within itself. - fix/add some comments/notes Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-04-30signals: ia64 renumber TIF_RESTORE_SIGMASKakpm@linux-foundation.org
TIF_RESTORE_SIGMASK no longer needs to be in the _TIF_WORK_* masks. Those low bits are scarce. Renumber TIF_RESTORE_SIGMASK to free one up. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12[IA64] use CORE_DUMP_USE_REGSETShaohua Li
After we have regset support, we can use CORE_DUMP_USE_REGSET. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08[IA64] Simplify cpu_idle_waitTony Luck
This is just Venki's patch[*] for x86 ported to ia64. * http://marc.info/?l=linux-kernel&m=120249201318159&w=2 Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08[IA64] Synchronize kernel RSE to user-space and backPetr Tesarik
This is base kernel patch for ptrace RSE bug. It's basically a backport from the utrace RSE patch I sent out several weeks ago. please review. when a thread is stopped (ptraced), debugger might change thread's user stack (change memory directly), and we must avoid the RSE stored in kernel to override user stack (user space's RSE is newer than kernel's in the case). To workaround the issue, we copy kernel RSE to user RSE before the task is stopped, so user RSE has updated data. we then copy user RSE to kernel after the task is resummed from traced stop and kernel will use the newer RSE to return to user. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Petr Tesarik <ptesarik@suse.cz> CC: Roland McGrath <roland@redhat.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-02-08[IA64] Rename TIF_PERFMON_WORK back to TIF_NOTIFY_RESUMEPetr Tesarik
Since the RSE synchronization will need a TIF_ flag, but all work-to-be-done bits are already used, so we have to multiplex TIF_NOTIFY_RESUME again. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-12-19[IA64] set_thread_area fails in IA32 chrootIan Wienand
I tried to upgrade an IA32 chroot on my IA64 to a new glibc with TLS. It kept dying because set_thread_area was returning -ESRCH (bugs.debian.org/451939). I instrumented arch/ia64/ia32/sys_ia32.c:get_free_idx() and ended up seeing output like [pid] idx desc->a desc->b ----------------------------- [2710] 0 -> c6b0ffff 40dff31b [2710] 1 -> 0 0 [2710] 2 -> 0 0 [2710] 0 -> c6b0ffff 40dff31b [2710] 1 -> c6b0ffff 40dff31b [2710] 2 -> 0 0 [2711] 0 -> c6b0ffff 40dff31b [2711] 1 -> c6b0ffff 40dff31b [2711] 2 -> 48c0ffff 40dff317 which suggested to me that TLS pointers were surviving exec() calls, leading to GDT pointers filling up and the eventual failure of get_free_idx(). I think the solution is flushing the tls array on exec. Signed-Off-By: Ian Wienand <ianw@gelato.unsw.edu.au> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-12-18[IA64] print kernel release in OOPS to make kerneloops.org happyLuck, Tony
The ia64 oops message doesn't include the kernel version, which makes it hard to automatically categorize oops messages scraped from mailing lists and bug databases. Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-10-19Use helpers to obtain task pid in printks (arch code)Alexey Dobriyan
One of the easiest things to isolate is the pid printed in kernel log. There was a patch, that made this for arch-independent code, this one makes so for arch/xxx files. It took some time to cross-compile it, but hopefully these are all the printks in arch code. Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-13[IA64] disable irq's and check need_resched before safe_haltDimitri Sivanich
While sending interrupts to a cpu to repeatedly wake a thread, on occasion that thread will take a full timer tick cycle (4002 usec in my case) to wakeup. The problem concerns a race condition in the code around the safe_halt() call in the default_idle() routine. Setting 'nohalt' on the kernel command line causes the long wakeups to disappear. void default_idle (void) { local_irq_enable(); while (!need_resched()) { --> if (can_do_pal_halt) --> safe_halt(); else A timer tick could arrive between the check for !need_resched and the actual call to safe_halt() (which does a pal call to PAL_HALT_LIGHT). By the time the timer tick completes, a thread that might now need to run could get held up for as long as a timer tick waiting for the halted cpu. I'm proposing that we disable irq's and check need_resched again before calling safe_halt(). Does anyone see any problem with this approach? Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-07-25[IA64] rename partial_pageakpm@linux-foundation.org
Jens has added a partial_page thing in splice whcih conflicts with the ia64 one. Rename ia64 out of the way. (ia64 chose poorly). Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-07-11[IA64] silence GCC ia64 unused variable warningsJes Sorensen
Tell GCC to stop spewing out unnecessary warnings for unused variables passed to functions as pointers for ia64 files. Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-22[IA64] Only unwind non-running tasks.Robin Holt
Unwinding a running task has proven problematic. In one instance, the running task was attempting to unwind itself and received an interrupt between when get_wchan allocated local variables on the stack and when unw_init_from_blocked_task was called which resulted in unw_init_frame_info to place this tasks task_struct pointer over the switch stack's ar_bspstore entry. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] wire up pselect, ppoll [IA64] Add TIF_RESTORE_SIGMASK [IA64] unwind did not work for processes born with CLONE_STOPPED [IA64] Optional method to purge the TLB on SN systems [IA64] SPIN_LOCK_UNLOCKED macro cleanup in arch/ia64 [IA64-SN2][KJ] mmtimer.c-kzalloc [IA64] fix stack alignment for ia32 signal handlers [IA64] - Altix: hotplug after intr redirect can crash system [IA64] save and restore cpus_allowed in cpu_idle_wait [IA64] Removal of percpu TR cleanup in kexec code [IA64] Fix some section mismatch errors
2007-05-08[IA64] Add TIF_RESTORE_SIGMASKAlexey Dobriyan
Preparation for pselect and ppoll. ia32 compat code not tested. :-( Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-05-08header cleaning: don't include smp_lock.h when not usedRandy Dunlap
Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08move die notifier handling to common codeChristoph Hellwig
This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08[IA64] save and restore cpus_allowed in cpu_idle_waitSiddha, Suresh B
Save and Restore the task's cpus_allowed mask, across the set_cpus_allowed() in cpu_idle_wait(). Without this, we will endup corrupting task's cpu affinity. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05[IA64] kexec: Move machine_shutdown from machine_kexec.c to process.cHorms
This moves the ia64 implementation of machine_shutdown() from machine_kexec.c to process.c, which is in keeping with the implelmentation on other architectures, and seems like a much more appropriate home for it. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-12-22[PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and ↵Ingo Molnar
APM idle code Fernando Lopez-Lezcano reported frequent scheduling latencies and audio xruns starting at the 2.6.18-rt kernel, and those problems persisted all until current -rt kernels. The latencies were serious and unjustified by system load, often in the milliseconds range. After a patient and heroic multi-month effort of Fernando, where he tested dozens of kernels, tried various configs, boot options, test-patches of mine and provided latency traces of those incidents, the following 'smoking gun' trace was captured by him: _------=> CPU# / _-----=> irqs-off | / _----=> need-resched || / _---=> hardirq/softirq ||| / _--=> preempt-depth |||| / ||||| delay cmd pid ||||| time | caller \ / ||||| \ | / IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (try_to_wake_up) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup <<...>-5856> (37 0) IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (c01262ba 0 0) IRQ_19-1479 1D..1 0us : resched_task (try_to_wake_up) IRQ_19-1479 1D..1 0us : __spin_unlock_irqrestore (try_to_wake_up) ... <idle>-0 1...1 11us!: default_idle (cpu_idle) ... <idle>-0 0Dn.1 602us : smp_apic_timer_interrupt (c0103baf 1 0) ... <...>-5856 0D..2 618us : __switch_to (__schedule) <...>-5856 0D..2 618us : __schedule <<idle>-0> (20 162) <...>-5856 0D..2 619us : __spin_unlock_irq (__schedule) <...>-5856 0...1 619us : trace_stop_sched_switched (__schedule) <...>-5856 0D..1 619us : trace_stop_sched_switched <<...>-5856> (37 0) what is visible in this trace is that CPU#1 ran try_to_wake_up() for PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task() for CPU#0. But it decided to not send an IPI that no CPU - due to TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set, and only rescheduled to PID:5856 upon the next lapic timer IRQ. The result was a 600+ usecs latency and a missed wakeup! the bug turned out to be an idle-wakeup bug introduced into the mainline kernel this summer via an optimization in the x86_64 tree: commit 495ab9c045e1b0e5c82951b762257fe1c9d81564 Author: Andi Kleen <ak@suse.de> Date: Mon Jun 26 13:59:11 2006 +0200 [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status During some profiling I noticed that default_idle causes a lot of memory traffic. I think that is caused by the atomic operations to clear/set the polling flag in thread_info. There is actually no reason to make this atomic - only the idle thread does it to itself, other CPUs only read it. So I moved it into ti->status. the problem is this type of change: if (!hlt_counter && boot_cpu_data.hlt_works_ok) { - clear_thread_flag(TIF_POLLING_NRFLAG); + current_thread_info()->status &= ~TS_POLLING; smp_mb__after_clear_bit(); while (!need_resched()) { local_irq_disable(); this changes clear_thread_flag() to an explicit clearing of TS_POLLING. clear_thread_flag() is defined as: clear_bit(flag, &ti->flags); and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms: static inline void clear_bit(int nr, volatile unsigned long * addr) { __asm__ __volatile__( LOCK_PREFIX "btrl %1,%0" hence smp_mb__after_clear_bit() is defined as a simple compile barrier: #define smp_mb__after_clear_bit() barrier() but the explicit TS_POLLING clearing introduced by the patch: + current_thread_info()->status &= ~TS_POLLING; is not an atomic op! So the clearing of the TS_POLLING bit is freely reorderable with the reading of the NEED_RESCHED bit - and both now reside in different memory addresses. CPU idle wakeup very much depends on ordered memory ops, the clearing of the TS_POLLING flag must always be done before we test need_resched() and hit the idle instruction(s). [Symmetrically, the wakeup code needs to set NEED_RESCHED before it tests the TS_POLLING flag, so memory ordering is paramount.] Fernando's dual-core Athlon64 system has a sufficiently advanced memory ordering model so that it triggered this scenario very often. ( And it also turned out that the reason why these latencies never triggered on my testsystems is that i routinely use idle=poll, which was the only idle variant not affected by this bug. ) The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to act as an absolute barrier between the TS_POLLING write and the NEED_RESCHED read. This affects almost all idling methods (default, ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64. Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] remove remaining errno and __KERNEL_SYSCALLS__ referencesArnd Bergmann
The last in-kernel user of errno is gone, so we should remove the definition and everything referring to it. This also removes the now-unused lib/execve.c file that was introduced earlier. Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26[PATCH] i386/x86-64/ia64: Move polling flag into thread_info_statusAndi Kleen
During some profiling I noticed that default_idle causes a lot of memory traffic. I think that is caused by the atomic operations to clear/set the polling flag in thread_info. There is actually no reason to make this atomic - only the idle thread does it to itself, other CPUs only read it. So I moved it into ti->status. Converted i386/x86-64/ia64 for now because that was the easiest way to fix ACPI which also manipulates these flags in its idle function. Cc: Nick Piggin <npiggin@novell.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Len Brown <len.brown@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26[PATCH] kretprobe instance recycled by parent processbibo mao
When kretprobe probes the schedule() function, if the probed process exits then schedule() will never return, so some kretprobe instances will never be recycled. In this patch the parent process will recycle retprobe instances of the probed function and there will be no memory leak of kretprobe instances. Signed-off-by: bibo mao <bibo.mao@intel.com> Cc: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] ia64: task_pt_regs()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-14[IA64] fix for SET_PERSONALITY when CONFIG_IA32_SUPPORT is not set.Robin Holt
Missed this when fixing the SET_PERSONALITY change. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-12-06[IA64] Change SET_PERSONALITY to comply with comment in binfmt_elf.c.Robin Holt
We have a customer application which trips a bug. The problem arises when a driver attempts to call do_munmap on an area which is mapped, but because current->thread.task_size has been set to 0xC0000000, the call to do_munmap fails thinking it is an unmap beyond the user's address space. The comment in fs/binfmt_elf.c in load_elf_library() before the call to SET_PERSONALITY() indicates that task_size must not be changed for the running application until flush_thread, but is for ia64 executing ia32 binaries. This patch moves the setting of task_size from SET_PERSONALITY() to flush_thread() as indicated. The customer application no longer is able to trip the bug. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-11-23[PATCH] kprobes: Fix return probes on sys_execveJim Keniston
Fix a bug in kprobes that can cause an Oops or even a crash when a return probe is installed on one of the following functions: sys_execve, do_execve, load_*_binary, flush_old_exec, or flush_thread. The fix is to remove the call to kprobe_flush_task() in flush_thread(). This fix has been tested on all architectures for which the return-probes feature has been implemented (i386, x86_64, ppc64, ia64). Please apply. BACKGROUND Up to now, we have called kprobe_flush_task() under two situations: when a task exits, and when it execs. Flushing kretprobe_instances on exit is correct because (a) do_exit() doesn't return, and (b) one or more return-probed functions may be active when a task calls do_exit(). Neither is the case for sys_execve() and its callees. Initially, the mistaken call to kprobe_flush_task() on exec was harmless because we put the "real" return address of each active probed function back in the stack, just to be safe, when we recycled its kretprobe_instance. When support for ppc64 and ia64 was added, this safety measure couldn't be employed, and was eventually dropped even for i386 and x86_64. sys_execve() and its callees were informally blacklisted for return probes until this fix was developed. Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-15[PATCH] ia64: cpu_idle performance bug fixChen, Kenneth W
Our performance validation on 2.6.15-rc1 caught a disastrous performance regression on ia64 with netperf (-98%) and volanomark (-58%) compares to previous kernel version 2.6.14-git7. See the following chart (result group 1 & 2). http://kernel-perf.sourceforge.net/results.machine_id=26.html We have root caused it to commit 64c7c8f88559624abdbe12b5da6502e8879f8d28 This changeset broke the ia64 task resched notification. In sched.c:resched_task(), a reschedule IPI is conditioned upon TIF_POLLING_NRFLAG. However, the above changeset unconditionally set the polling thread flag for idle tasks regardless whether pal_halt_light is in use or not. As a result, resched IPI is not sent from resched_task(). And since the default behavior on ia64 is to use pal_halt_light, we end up delaying the rescheduling task until next timer tick, and thus cause the performance regression. This fixes the performance bug. I'm glad our performance suite is turning up bad performance bug like this in time. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-10Pull extend-notify-die into release branchTony Luck
2005-11-09[PATCH] sched: resched and cpu_idle reworkNick Piggin
Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce confusion, and make their semantics rigid. Improves efficiency of resched_task and some cpu_idle routines. * In resched_task: - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held, and as we hold it during resched_task, then there is no need for an atomic test and set there. The only other time this should be set is when the task's quantum expires, in the timer interrupt - this is protected against because the rq lock is irq-safe. - If TIF_NEED_RESCHED is set, then we don't need to do anything. It won't get unset until the task get's schedule()d off. - If we are running on the same CPU as the task we resched, then set TIF_NEED_RESCHED and no further action is required. - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set after TIF_NEED_RESCHED has been set, then we need to send an IPI. Using these rules, we are able to remove the test and set operation in resched_task, and make clear the previously vague semantics of POLLING_NRFLAG. * In idle routines: - Enter cpu_idle with preempt disabled. When the need_resched() condition becomes true, explicitly call schedule(). This makes things a bit clearer (IMO), but haven't updated all architectures yet. - Many do a test and clear of TIF_NEED_RESCHED for some reason. According to the resched_task rules, this isn't needed (and actually breaks the assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock held). So remove that. Generally one less locked memory op when switching to the idle thread. - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the inner most polling idle loops. The above resched_task semantics allow it to be set until before the last time need_resched() is checked before going into a halt requiring interrupt wakeup. Many idle routines simply never enter such a halt, and so POLLING_NRFLAG can be always left set, completely eliminating resched IPIs when rescheduling the idle task. POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Con Kolivas <kernel@kolivas.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] sched: disable preempt in idle tasksNick Piggin
Run idle threads with preempt disabled. Also corrected a bugs in arm26's cpu_idle (make it actually call schedule()). How did it ever work before? Might fix the CPU hotplugging hang which Nigel Cunningham noted. We think the bug hits if the idle thread is preempted after checking need_resched() and before going to sleep, then the CPU offlined. After calling stop_machine_run, the CPU eventually returns from preemption and into the idle thread and goes to sleep. The CPU will continue executing previous idle and have no chance to call play_dead. By disabling preemption until we are ready to explicitly schedule, this bug is fixed and the idle threads generally become more robust. From: alexs <ashepard@u.washington.edu> PPC build fix From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> MIPS build fix Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07[IA64] Extend notify_die() hooks for IA64Keith Owens
notify_die() added for MCA_{MONARCH,SLAVE,RENDEZVOUS}_{ENTER,PROCESS,LEAVE} and INIT_{MONARCH,SLAVE}_{ENTER,PROCESS,LEAVE}. We need multiple notification points for these events because they can take many seconds to run which has nasty effects on the behaviour of the rest of the system. DIE_SS replaced by a generic DIE_FAULT which checks the vector number, to allow interception of faults other than SS. DIE_MACHINE_{HALT,RESTART} added to allow last minute close down processing, especially when the halt/restart routines are called from error handlers. DIE_OOPS added. The check for kprobe's break numbers has been moved from traps.c to kprobes.c, allowing DIE_BREAK to be used for any additional break numbers, i.e. it is no longer kprobes specific. Hooks for kernel debuggers and kernel dumpers added, ENTER and LEAVE. Both of these disable the system for long periods which impact on watchdogs and heartbeat systems in general. More patches to come that use these events to reset watchdogs and heartbeats. unregister_die_notifier() added and both routines exported. Requested by Dean Nelson. Lock removed from {un,}register_die_notifier. notifier_chain_register() already takes a lock. Also the generic notifier chain locking is being reworked to distinguish between callbacks that can block and those that cannot, the lock in {un,}register_die_notifier would interfere with that change. http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2 Leading white space removed from arch/ia64/kernel/kprobes.c. Typo in mca.c in original version of this patch found & fixed by Dean Nelson. Signed-off-by: Keith Owens <kaos@sgi.com> Acked-by: Dean Nelson <dcn@sgi.com> Acked-by: Anil Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-08-08[IA64] fix nohalt boot optionKen Chen
this changeset broke the "nohalt" kernel boot option. 8df5a500a3e97f7811cdce0f553ca1917ccd4220 default_idle() is looking at new variable can_do_pal_halt. However, that variable did not get cleared upon "nohalt" boot option. Result is that "nohalt" option is ignored until perfmon is exercised. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-26[PATCH] Don't export machine_restart, machine_halt, or machine_power_off.Eric W. Biederman
machine_restart, machine_halt and machine_power_off are machine specific hooks deep into the reboot logic, that modules have no business messing with. Usually code should be calling kernel_restart, kernel_halt, kernel_power_off, or emergency_restart. So don't export machine_restart, machine_halt, and machine_power_off so we can catch buggy users. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12[ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc...Len Brown
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12[ACPI] fix C1 patch for IA64Venkatesh Pallipadi
http://bugzilla.kernel.org/show_bug.cgi?id=4233 Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2005-06-27[PATCH] Return probe redesign: ia64 specific implementationRusty Lynch
The following patch implements function return probes for ia64 using the revised design. With this new design we no longer need to do some of the odd hacks previous required on the last ia64 return probe port that I sent out for comments. Note that this new implementation still does not resolve the problem noted by Keith Owens where backtrace data is lost after a return probe is hit. Changes include: * Addition of kretprobe_trampoline to act as a dummy function for instrumented functions to return to, and for the return probe infrastructure to place a kprobe on on, gaining control so that the return probe handler can be called, and so that the instruction pointer can be moved back to the original return address. * Addition of arch_init(), allowing a kprobe to be registered on kretprobe_trampoline * Addition of trampoline_probe_handler() which is used as the pre_handler for the kprobe inserted on kretprobe_implementation. This is the function that handles the details for calling the return probe handler function and returning control back at the original return address * Addition of arch_prepare_kretprobe() which is setup as the pre_handler for a kprobe registered at the beginning of the target function by kernel/kprobes.c so that a return probe instance can be setup when a caller enters the target function. (A return probe instance contains all the needed information for trampoline_probe_handler to do it's job.) * Hooks added to the exit path of a task so that we can cleanup any left-over return probe instances (i.e. if a task dies while inside a targeted function then the return probe instance was reserved at the beginning of the function but the function never returns so we need to mark the instance as unused.) Signed-off-by: Rusty Lynch <rusty.lynch@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-03[IA64] Fix two warnings introduced by perfmon patches.Tony Luck
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03[IA64] perfmon & PAL_HALT againStephane Eranian
The pmu_active test is based on the values of PSR.up. THIS IS THE PROBLEM as it does not take into account the lazy restore logic which is as follow (simplified): context switch out: save PMDs clear psr.up release ownership context switch in: if (ctx->last_cpu == smp_processor_id() && ctx->cpu_activation == cpu_activation) { set psr.up return } restore PMD restore PMC ctx->last_cpu = smp_processor_id(); ctx->activation = ++cpu_activation; set psr.up The key here is that on context switch out, we clear psr.up and on context switch in we check if nobody else used the PMU on that processor since last time we came. In that case, we assume the PMD/PMC are ours and we simply reactivate. The Caliper problem is that between the moment we context switch out and the moment we come back, nobody effectively used the PMU BUT the processor went idle. Normally this would have no incidence but PAL_HALT does alter the PMU registers. In default_idle(), the test on psr.up is not strong enough to cover this case and we go into PAL which trashed the PMU resgisters. When we come back we falsely assume that this is our state yet it is corrupted. Very nasty indeed. To avoid the problem it is necessary to forbid going to PAL_HALT as soon as perfmon installs some valid state in the PMU registers. This happens with an application attaches a context to a thread or CPU. It is not enough to check the psr/dcr bits. Hence I propose the attached patch. It adds a callback in process.c to modify the condition to enter PAL on idle. Basically, now it is conditional to pal_halt=1 AND perfmon saying it is okay. Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-05-03[IA64] reduce cacheline bouncing in cpu_idle_waitZwane Mwaikambo
Andi noted that during normal runtime cpu_idle_map is bounced around a lot, and occassionally at a higher frequency than the timer interrupt wakeup which we normally exit pm_idle from. So switch to a percpu variable. I didn't move things to the slow path because it would involve adding scheduler code to wakeup the idle thread on the cpus we're waiting for. Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-22[IA64] cpu hotplug: return offlined cpus to SALAshok Raj
This patch is required to support cpu removal for IPF systems. Existing code just fakes the real offline by keeping it run the idle thread, and polling for the bit to re-appear in the cpu_state to get out of the idle loop. For the cpu-offline to work correctly, we need to pass control of this CPU back to SAL so it can continue in the boot-rendez mode. This gives the SAL control to not pick this cpu as the monarch processor for global MCA events, and addition does not wait for this cpu to checkin with SAL for global MCA events as well. The handoff is implemented as documented in SAL specification section 3.2.5.1 "OS_BOOT_RENDEZ to SAL return State" Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!