aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2007-08-01genirq: temporary fix for level-triggered IRQ resendThomas Gleixner
Marcin Slusarz reported a ne2k-pci "hung network interface" regression. delayed disable relies on the ability to re-trigger the interrupt in the case that a real interrupt happens after the software disable was set. In this case we actually disable the interrupt on the hardware level _after_ it occurred. On enable_irq, we need to re-trigger the interrupt. On i386 this relies on a hardware resend mechanism (send_IPI_self()). Actually we only need the resend for edge type interrupts. Level type interrupts come back once enable_irq() re-enables the interrupt line. I assume that the interrupt in question is level triggered because it is shared and above the legacy irqs 0-15: 17: 12 IO-APIC-fasteoi eth1, eth0 Looking into the IO_APIC code, the resend via send_IPI_self() happens unconditionally. So the resend is done for level and edge interrupts. This makes the problem more mysterious. The code in question lib8390.c does disable_irq(); fiddle_with_the_network_card_hardware() enable_irq(); The fiddle_with_the_network_card_hardware() might cause interrupts, which are cleared in the same code path again, Marcin found that when he disables the irq line on the hardware level (removing the delayed disable) the card is kept alive. So the difference is that we can get a resend on enable_irq, when an interrupt happens during the time, where we are in the disabled region. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31Fix a use after free bug in kernel->userspace relay file supportJesper Juhl
Coverity spotted what looks like a real possible case of using a variable after it has been freed. The problem is in kernel/relay.c::relay_open_buf() If the code hits "goto free_buf;" it ends up in this code : free_buf: relay_destroy_buf(buf); <--- calls kfree() on 'buf'. free_name: kfree(tmpname); end: return buf; <-- use after free of 'buf'. I read through the callers and they all handle a NULL return from this function as an error (and hitting the 'free_buf' label only happens on failure to chan->cb->create_buf_file(), so that looks like a clear error to me). The patch simply sets 'buf' to NULL after the call to relay_destroy_buf(buf); - as far as I can see that should take care of the problem. The patch also corrects a reference to a documentation file while I was at it. Note from Mathieu: the documentation reference change should have been done in a separate patch, but I guess no one will really care. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: "David J. Wilder" <wilder@us.ibm.com> Tested-by: "David J. Wilder" <wilder@us.ibm.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Tom Zanussi <zanussi@us.ibm.com> Cc: Karim Yaghmour <karim@opersys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31kthread: silence bogus section mismatch warningSatyam Sharma
WARNING: kernel/built-in.o(.text+0x16910): Section mismatch: reference to .init.text: (between 'kthreadd' and 'init_waitqueue_head') comes because kernel/kthread.c:kthreadd() is not __init but calls kthreadd_setup() which is __init. But this is ok, because kthreadd_setup() is only ever called at init time, and then kthreadd() proceeds into its "for (;;)" loop. We could mark kthreadd __init_refok, but kthreadd_setup() with just one callsite and 4 lines in it (it's been that small since 10ab825bdef8df51) doesn't need to be a separate function at all -- so let's just move those four lines at beginning of kthreadd() itself. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31futex: pass nr_wake2 to futex_wake_opAndreas Schwab
The fourth argument of sys_futex is ignored when op == FUTEX_WAKE_OP, but futex_wake_op expects it as its nr_wake2 parameter. The only user of this operation in glibc is always passing 1, so this bug had no consequences so far. Signed-off-by: Andreas Schwab <schwab@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31Fix leak on /proc/lockdep_statsAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31Fix leaks on /proc/{*/sched,sched_debug,timer_list,timer_stats}Alexey Dobriyan
On every open/close one struct seq_operations leaks. Kudos to /proc/slab_allocators. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31sched: fix kernel-doc warningsRandy Dunlap
Fix kernel-doc warnings in sched.c: Warning(linux-2623-rc1g4//kernel/sched.c:1685): No description found for parameter 'notifier' Warning(linux-2623-rc1g4//kernel/sched.c:1696): No description found for parameter 'notifier' Warning(linux-2623-rc1g4//kernel/sched.c:1750): No description found for parameter 'prev' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-30modules: better error messages when modules fail to load due to a sysfs problem.Greg Kroah-Hartman
This helps people when debugging problems like the ones that were in the recent -mm releases. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-29ACPI: restore CONFIG_ACPI_SLEEPLen Brown
Restore the 2.6.22 CONFIG_ACPI_SLEEP build option, but now shadowing the new CONFIG_PM_SLEEP option. Signed-off-by: Len Brown <len.brown@intel.com> [ Modified to work with the PM config setup changes. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-29Introduce CONFIG_SUSPEND for suspend-to-Ram and standbyRafael J. Wysocki
Introduce CONFIG_SUSPEND representing the ability to enter system sleep states, such as the ACPI S3 state, and allow the user to choose SUSPEND and HIBERNATION independently of each other. Make HOTPLUG_CPU be selected automatically if SUSPEND or HIBERNATION has been chosen and the kernel is intended for SMP systems. Also, introduce CONFIG_PM_SLEEP which is automatically selected if CONFIG_SUSPEND or CONFIG_HIBERNATION is set and use it to select the code needed for both suspend and hibernation. The top-level power management headers and the ACPI code related to suspend and hibernation are modified to use the new definitions (the changes in drivers/acpi/sleep/main.c are, mostly, moving code to reduce the number of ifdefs). There are many other files in which CONFIG_PM can be replaced with CONFIG_PM_SLEEP or even with CONFIG_SUSPEND, but they can be updated in the future. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-29Replace CONFIG_SOFTWARE_SUSPEND with CONFIG_HIBERNATIONRafael J. Wysocki
Replace CONFIG_SOFTWARE_SUSPEND with CONFIG_HIBERNATION to avoid confusion (among other things, with CONFIG_SUSPEND introduced in the next patch). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-28audit: fix two bugs in the new execve audit codePeter Zijlstra
copy_from_user() returns the number of bytes not copied, hence 0 is the expected output. axi->mm might not be valid anymore when not equal to current->mm, do not dereference before checking that - thanks to Al for spotting that. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-28rip some includes from linux/interrupt.hAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-schedLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: [PATCH] sched: debug feature - make the sched-domains tree runtime-tweakable [PATCH] sched: add above_background_load() function [PATCH] sched: update Documentation/sched-stats.txt [PATCH] sched: mark sysrq_sched_debug_show() static [PATCH] sched: make cpu_clock() not use the rq clock [PATCH] sched: remove unused rq->load_balance_class [PATCH] sched: arch preempt notifier mechanism [PATCH] sched: increase SCHED_LOAD_SCALE_FUZZ
2007-07-26Fix ThinkPad T42 poweroff failure introduced by by "PM: Introduce ↵Rafael J. Wysocki
pm_power_off_prepare" Commit bd804eba1c8597cbb7cd5a5f9fe886aae16a079a ("PM: Introduce pm_power_off_prepare") caused problems in the poweroff path, as reported by YOSHIFUJI Hideaki / 吉藤英明. Generally, sysdev_shutdown() should be called after the ACPI preparation for powering the system off. To make it happen, we can separate sysdev_shutdown() from device_shutdown() and call it directly wherever necessary. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Tested-by: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26kernel-doc fix for kmod.cRandy Dunlap
Fix kmod.c: Warning(linux-2.6.23-rc1//kernel/kmod.c:364): No description found for parameter 'envp' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26[PATCH] sched: debug feature - make the sched-domains tree runtime-tweakableNick Piggin
debugging feature: make the sched-domains tree runtime-tweakable. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ mingo@elte.hu: made it depend on CONFIG_SCHED_DEBUG & small updates ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26[PATCH] sched: mark sysrq_sched_debug_show() staticJosh Triplett
Only sched.c uses sysrq_sched_debug_show, and sched.c includes sched_debug.c, so all uses of sysrq_sched_debug_show occur in the same source file. Eliminates a sparse warning: warning: symbol 'sysrq_sched_debug_show' was not declared. Should it be static? Signed-off-by: Josh Triplett <josh@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26[PATCH] sched: make cpu_clock() not use the rq clockIngo Molnar
it is enough to disable interrupts to get the precise rq-clock of the local CPU. this also solves an NMI watchdog regression: the NMI watchdog calls touch_softlockup_watchdog(), which might deadlock on rq->lock if the NMI hits an rq-locked critical section. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26[PATCH] sched: remove unused rq->load_balance_classSatoru Takeuchi
Remove unused rq->load_balance_class. Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-26[PATCH] sched: arch preempt notifier mechanismAvi Kivity
This adds a general mechanism whereby a task can request the scheduler to notify it whenever it is preempted or scheduled back in. This allows the task to swap any special-purpose registers like the fpu or Intel's VT registers. Signed-off-by: Avi Kivity <avi@qumranet.com> [ mingo@elte.hu: fixes, cleanups ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-25Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPI: Kconfig: remove CONFIG_ACPI_SLEEP from source ACPI: quiet ACPI Exceptions due to no _PTC or _TSS ACPI: Remove references to ACPI_STATE_S2 from acpi_pm_enter ACPI: Kconfig: always enable CONFIG_ACPI_SLEEP on X86 ACPI: Kconfig: fold /proc/acpi/sleep under CONFIG_ACPI_PROCFS ACPI: Kconfig: CONFIG_ACPI_PROCFS now defaults to N ACPI: autoload modules - Create __mod_acpi_device_table symbol for all ACPI drivers ACPI: autoload modules - Create ACPI alias interface ACPI: autoload modules - ACPICA modifications ACPI: asus-laptop: Fix failure exits ACPI: fix oops due to typo in new throttling code ACPI: ignore _PSx method for hotplugable PCI devices ACPI: Use ACPI methods to select PCI device suspend state ACPI, PNP: hook ACPI D-state to PNP suspend/resume ACPI: Add acpi_pm_device_sleep_state helper routine ACPI: Implement the set_target() callback from pm_ops
2007-07-25Cache xtime every call to update_wall_timejohn stultz
This avoids xtime lag seen with dynticks, because while 'xtime' itself is still not updated often, we keep a 'xtime_cache' variable around that contains the approximate real-time that _is_ updated each time we do a 'update_wall_time()', and is thus never off by more than one tick. IOW, this restores the original semantics for 'xtime' users, as long as you use the proper abstraction functions (ie 'current_kernel_time()' or 'get_seconds()' depending on whether you want a timespec or just the seconds field). [ Updated Patch. As penance for my sins I've also yanked another #ifdef that was added to avoid the xtime lag w/ hrtimers. ] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-25Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time().john stultz
This avoids use of the kernel-internal "xtime" variable directly outside of the actual time-related functions. Instead, use the helper functions that we already have available to us. This doesn't actually change any behaviour, but this will allow us to fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ (because much of the realtime information is maintained as separate offsets to 'xtime'), which has caused interfaces that use xtime directly to get a time that is out of sync with the real-time clock by up to a third of a second or so. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-25ACPI: Kconfig: remove CONFIG_ACPI_SLEEP from sourceLen Brown
As it was a synonym for (CONFIG_ACPI && CONFIG_X86), the ifdefs for it were more clutter than they were worth. For ia64, just add a few stubs in anticipation of future S3 or S4 support. Signed-off-by: Len Brown <len.brown@intel.com>
2007-07-22Merge branch 'audit.b39' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b39' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] get rid of AVC_PATH postponed treatment [PATCH] allow audit filtering on bit & operations [PATCH] audit: fix broken class-based syscall audit [PATCH] Make IPC mode consistent
2007-07-22x86: i386-show-unhandled-signals-v3Masoud Asgharifard Sharbiani
This patch makes the i386 behave the same way that x86_64 does when a segfault happens. A line gets printed to the kernel log so that tools that need to check for failures can behave more uniformly between debug.show_unhandled_signals sysctl variable to 0 (or by doing echo 0 > /proc/sys/debug/exception-trace) Also, all of the lines being printed are now using printk_ratelimit() to deny the ability of DoS from a local user with a program like the following: main() { while (1) if (!fork()) *(int *)0 = 0; } This new revision also includes the fix that Andrew did which got rid of new sysctl that was added to the system in earlier versions of this. Also, 'show-unhandled-signals' sysctl has been renamed back to the old 'exception-trace' to avoid breakage of people's scripts. AK: Enabling by default for i386 will be likely controversal, but let's see what happens AK: Really folks, before complaining just fix your segfaults AK: I bet this will find a lot of silent issues Signed-off-by: Masoud Sharbiani <masouds@google.com> Signed-off-by: Andi Kleen <ak@suse.de> [ Personally, I've found the complaints useful on x86-64, so I'm all for this. That said, I wonder if we could do it more prettily.. -Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-22[PATCH] get rid of AVC_PATH postponed treatmentAl Viro
Selinux folks had been complaining about the lack of AVC_PATH records when audit is disabled. I must admit my stupidity - I assumed that avc_audit() really couldn't use audit_log_d_path() because of deadlocks (== could be called with dcache_lock or vfsmount_lock held). Shouldn't have made that assumption - it never gets called that way. It _is_ called under spinlocks, but not those. Since audit_log_d_path() uses ab->gfp_mask for allocations, kmalloc() in there is not a problem. IOW, the simple fix is sufficient: let's rip AUDIT_AVC_PATH out and simply generate pathname as part of main record. It's trivial to do. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: James Morris <jmorris@namei.org>
2007-07-22[PATCH] allow audit filtering on bit & operationsEric Paris
Right now the audit filter can match on = != > < >= blah blah blah. This allow the filter to also look at bitwise AND operations, & Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-22[PATCH] audit: fix broken class-based syscall auditKlaus Weidner
The sanity check in audit_match_class() is wrong. We are able to audit 2048 syscalls but in audit_match_class() we were accidentally using sizeof(_u32) instead of number of bits in _u32 when deciding how many syscalls were valid. On ia64 in particular we were hitting syscall numbers over the (wrong) limit of 256. Fixing the audit_match_class check takes care of the problem. Signed-off-by: Klaus Weidner <klaus@atsec.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-22[PATCH] Make IPC mode consistentSteve Grubb
The mode fields for IPC records are not consistent. Some are hex, others are octal. This patch makes them all octal. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-07-21x86: PM_TRACE supportNigel Cunningham
Signed-off-by: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21x86_64: Report the pending irq if available in smp_affinityAndi Kleen
Otherwise smp_affinity would only update after the next interrupt on x86 systems. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21NTP: move the cmos update code into ntp.cThomas Gleixner
i386 and sparc64 have the identical code to update the cmos clock. Move it into kernel/time/ntp.c as there are other architectures coming along with the same requirements. [akpm@linux-foundation.org: build fixes] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: David Miller <davem@davemloft.net> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21hrtimer: speedup hrtimer_enqueueIngo Molnar
Speedup hrtimer_enqueue by evaluating the rbtree insertion result. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21highres: improve debug outputIngo Molnar
Add some more debug information to the hrtimer and clock events code. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21tick management: spread timer interruptjohn stultz
After discussing w/ Thomas over IRC, it seems the issue is the sched tick fires on every cpu at the same time, causing extra lock contention. This smaller change, adds an extra offset per cpu so the ticks don't line up. This patch also drops the idle latency from 40us down to under 20us. Signed-off-by: john stultz <johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21clockevents: fix device replacementThomas Gleixner
When a device is replaced by a better rated device, then the broadcast mode needs to be evaluated again. When the new device has no requirement for broadcasting, then the broadcast bits for the CPU must be cleared. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21clockevents: fix resume logicThomas Gleixner
We need to make sure, that the clockevent devices are resumed, before the tick is resumed. The current resume logic does not guarantee this. Add CLOCK_EVT_MODE_RESUME and call the set mode functions of the clock event devices before resuming the tick / oneshot functionality. Fixup the existing users. Thanks to Nigel Cunningham for tracking down a long standing thinko, which affected the jinxed VAIO. [akpm@linux-foundation.org: xen build fix] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20Revert "sys_time() speedup"Linus Torvalds
This basically reverts commit 4e44f3497d41db4c3b9051c61410dee8ae4fb49c, while waiting for it to be re-done more completely. There are cases of people mixing "time()" with higher-resolution time sources, and we need to take the nanosecond offsets into account. Ingo has a patch that does that, but it's still under some discussion. In the meantime, just revert back to the old simple situation of just doing the whole exact timesource calculations. But rather than using do_gettimeofday(), use the internal nanosecond resolution getnstimeofday(), which at least avoids one unnecessary conversion (since we really don't care about whether the fractional seconds are nanoseconds or microseconds - we'll just throw them away). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Prevent people from directly including <asm/rwsem.h>. [IA64] remove time interpolator [IA64] Convert to generic timekeeping/clocksource [IA64] refresh some config files for 64K pagesize [IA64] Delete iosapic_free_rte() [IA64] fallocate system call [IA64] Enable percpu vector domain for IA64_DIG [IA64] Enable percpu vector domain for IA64_GENERIC [IA64] Support irq migration across domain [IA64] Add support for vector domain [IA64] Add mapping table between irq and vector [IA64] Check if irq is sharable [IA64] Fix invalid irq vector assumption for iosapic [IA64] Use dynamic irq for iosapic interrupts [IA64] Use per iosapic lock for indirect iosapic register access [IA64] Cleanup lock order in iosapic_register_intr [IA64] Remove duplicated members in iosapic_rte_info [IA64] Remove block structure for locking in iosapic.c
2007-07-20FRV: Fix linkage problemsDavid Howells
Make it possible to use __start_notes and __stop_notes without getting a GPREL overflow error from the FRV linker. Small variables that would otherwise be in .data or .bss may, depending on the arch, be placed in special sections (.sdata or .sbss) that permit single instruction references on fixed instruction width machines. __start_notes and __stop_notes aren't really char variables, and certainly don't refer to data in .data or .bss. Making them type "void" fools the compiler into not assuming anything about them. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20Pull ia64-clocksource into release branchTony Luck
2007-07-20[IA64] remove time interpolatorBob Picco
Remove time_interpolator code (This is generic code, but only user was ia64. It has been superseded by the CONFIG_GENERIC_TIME code). Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Peter Keilty <peter.keilty@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-07-20mm: Remove slab destructors from kmem_cache_create().Paul Mundt
Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-19[PATCH] sched: implement cpu_clock(cpu) high-speed time sourceIngo Molnar
Implement the cpu_clock(cpu) interface for kernel-internal use: high-speed (but slightly incorrect) per-cpu clock constructed from sched_clock(). This API, unused at the moment, will be used in the future by blktrace, by the softlockup-watchdog, by printk and by lockstat. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-19[PATCH] sched: fix the all pinned logic in load_balance_newidle()Suresh Siddha
nr_moved is not the correct check for triggering all pinned logic. Fix the all pinned logic in the case of load_balance_newidle(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-19[PATCH] sched: fix newly idle load balance in case of SMTSuresh Siddha
In the presence of SMT, newly idle balance was never happening for multi-core and SMP domains (even when both the logical siblings are idle). If thread 0 is already idle and when thread 1 is about to go to idle, newly idle load balance always think that one of the threads is not idle and skips doing the newly idle load balance for multi-core and SMP domains. This is because of the idle_cpu() macro, which checks if the current process on a cpu is an idle process. But this is not the case for the thread doing the load_balance_newidle(). Fix this by using runqueue's nr_running field instead of idle_cpu(). And also skip the logic of 'only one idle cpu in the group will be doing load balancing' during newly idle case. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-19kernel/sysctl.c: finish off the warning commentsAndrew Morton
I've been chasing these comments around this file all week. Hopefully we're straight now. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19lguest: the host codeRusty Russell
This is the code for the "lg.ko" module, which allows lguest guests to be launched. [akpm@linux-foundation.org: update for futex-new-private-futexes] [akpm@linux-foundation.org: build fix] [jmorris@namei.org: lguest: use hrtimers] [akpm@linux-foundation.org: x86_64 build fix] Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>