Age | Commit message (Collapse) | Author |
|
asm/desc.h is included in three assembly files, but the only macro
it defines, GET_DESC_BASE, is never used. This patch removes the
includes, removes the macro GET_DESC_BASE and the ASSEMBLY guard
from asm/desc.h.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
The espfix code triggers if we have a protected mode userspace
application with a 16-bit stack. On returning to userspace, with iret,
the CPU doesn't restore the high word of the stack pointer. This is an
"official" bug, and the work-around used in the kernel is to temporarily
switch to a 32-bit stack segment/pointer pair where the high word of the
pointer is equal to the high word of the userspace stackpointer.
The current implementation uses THREAD_SIZE to determine the cut-off,
but there is no good reason not to use the more natural 64kb... However,
implementing this by simply substituting THREAD_SIZE with 65536 in
patch_espfix_desc crashed the test application. patch_espfix_desc tries
to do what is described above, but gets it subtly wrong if the userspace
stack pointer is just below a multiple of THREAD_SIZE: an overflow
occurs to bit 13... With a bit of luck, when the kernelspace
stackpointer is just below a 64kb-boundary, the overflow then ripples
trough to bit 16 and userspace will see its stack pointer changed by
65536.
This patch moves all espfix code into entry_32.S. Selecting a 16-bit
cut-off simplifies the code. The game with changing the limit dynamically
is removed too. It complicates matters and I see no value in it. Changing
only the top 16-bit word of ESP is one instruction and it also implies
that only two bytes of the ESPFIX GDT entry need to be changed and this
can be implemented in just a handful simple to understand instructions.
As a side effect, the operation to compute the original ESP from the
ESPFIX ESP and the GDT entry simplifies a bit too, and the remaining
three instructions have been expanded inline in entry_32.S.
impact: can now reliably run userspace with ESP=xxxxfffc on 16-bit
stack segment
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
Returning to a task with a 16-bit stack requires special care: the iret
instruction does not restore the high word of esp in that case. The
espfix code fixes this, but currently is not invoked on NMIs. This means
that a running task gets the upper word of esp clobbered due intervening
NMIs. To reproduce, compile and run the following program with the nmi
watchdog enabled (nmi_watchdog=2 on the command line). Using gdb you can
see that the high bits of esp contain garbage, while the low bits are
still correct.
This patch puts the espfix code back into the NMI code path.
The patch is slightly complicated due to the irqtrace infrastructure not
being NMI-safe. The NMI return path cannot call TRACE_IRQS_IRET.
Otherwise, the tail of the normal iret-code is correct for the nmi code
path too. To be able to share this code-path, the TRACE_IRQS_IRET was
move up a bit. The espfix code exists after the TRACE_IRQS_IRET, but
this code explicitly disables interrupts. This short interrupts-off
section is now not traced anymore. The return-to-kernel path now always
includes the preliminary test to decide if the espfix code should be
called. This is never the case, but doing it this way keeps the patch as
simple as possible and the few extra instructions should not affect
timing in any significant way.
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/ldt.h>
int modify_ldt(int func, void *ptr, unsigned long bytecount)
{
return syscall(SYS_modify_ldt, func, ptr, bytecount);
}
/* this is assumed to be usable */
#define SEGBASEADDR 0x10000
#define SEGLIMIT 0x20000
/* 16-bit segment */
struct user_desc desc = {
.entry_number = 0,
.base_addr = SEGBASEADDR,
.limit = SEGLIMIT,
.seg_32bit = 0,
.contents = 0, /* ??? */
.read_exec_only = 0,
.limit_in_pages = 0,
.seg_not_present = 0,
.useable = 1
};
int main(void)
{
setvbuf(stdout, NULL, _IONBF, 0);
/* map a 64 kb segment */
char *pointer = mmap((void *)SEGBASEADDR, SEGLIMIT+1,
PROT_EXEC|PROT_READ|PROT_WRITE,
MAP_SHARED|MAP_ANONYMOUS, -1, 0);
if (pointer == NULL) {
printf("could not map space\n");
return 0;
}
/* write ldt, new mode */
int err = modify_ldt(0x11, &desc, sizeof(desc));
if (err) {
printf("error modifying ldt: %i\n", err);
return 0;
}
for (int i=0; i<1000; i++) {
asm volatile (
"pusha\n\t"
"mov %ss, %eax\n\t" /* preserve ss:esp */
"mov %esp, %ebp\n\t"
"push $7\n\t" /* index 0, ldt, user mode */
"push $65536-4096\n\t" /* esp */
"lss (%esp), %esp\n\t" /* switch to new stack */
"push %eax\n\t" /* save old ss:esp on new stack */
"push %ebp\n\t"
"add $17*65536, %esp\n\t" /* set high bits */
"mov %esp, %edx\n\t"
"mov $10000000, %ecx\n\t" /* wait... */
"1: loop 1b\n\t" /* ... a bit */
"cmp %esp, %edx\n\t"
"je 1f\n\t"
"ud2\n\t" /* esp changed inexplicably! */
"1:\n\t"
"sub $17*65536, %esp\n\t" /* restore high bits */
"lss (%esp), %esp\n\t" /* restore old ss:esp */
"popa\n\t");
printf("\rx%ix", i);
}
return 0;
}
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
Vegard Nossum reported:
[ 503.576724] ACPI: Preparing to enter system sleep state S5
[ 503.710857] Disabling non-boot CPUs ...
[ 503.716853] Power down.
[ 503.717770] ------------[ cut here ]------------
[ 503.717770] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_du)
[ 503.717770] Hardware name: OptiPlex GX100
[ 503.717770] Modules linked in:
[ 503.717770] Pid: 2136, comm: halt Not tainted 2.6.30 #443
[ 503.717770] Call Trace:
[ 503.717770] [<c154d327>] ? printk+0x18/0x1a
[ 503.717770] [<c1017358>] ? native_apic_write_dummy+0x38/0x50
[ 503.717770] [<c10360fc>] warn_slowpath_common+0x6c/0xc0
[ 503.717770] [<c1017358>] ? native_apic_write_dummy+0x38/0x50
[ 503.717770] [<c1036165>] warn_slowpath_null+0x15/0x20
[ 503.717770] [<c1017358>] native_apic_write_dummy+0x38/0x50
[ 503.717770] [<c1017173>] disconnect_bsp_APIC+0x63/0x100
[ 503.717770] [<c1019e48>] disable_IO_APIC+0xb8/0xc0
[ 503.717770] [<c1214231>] ? acpi_power_off+0x0/0x29
[ 503.717770] [<c1015e55>] native_machine_shutdown+0x65/0x80
[ 503.717770] [<c1015c36>] native_machine_power_off+0x26/0x30
[ 503.717770] [<c1015c49>] machine_power_off+0x9/0x10
[ 503.717770] [<c1046596>] kernel_power_off+0x36/0x40
[ 503.717770] [<c104680d>] sys_reboot+0xfd/0x1f0
[ 503.717770] [<c109daa0>] ? perf_swcounter_event+0xb0/0x130
[ 503.717770] [<c109db7d>] ? perf_counter_task_sched_out+0x5d/0x120
[ 503.717770] [<c102dfc6>] ? finish_task_switch+0x56/0xd0
[ 503.717770] [<c154da1e>] ? schedule+0x49e/0xb40
[ 503.717770] [<c10444b0>] ? sys_kill+0x70/0x160
[ 503.717770] [<c119d9db>] ? selinux_file_ioctl+0x3b/0x50
[ 503.717770] [<c10dd443>] ? sys_ioctl+0x63/0x70
[ 503.717770] [<c1003024>] sysenter_do_call+0x12/0x22
[ 503.717770] ---[ end trace 8157b5d0ed378f15 ]---
|
| That's including this commit:
|
| commit 103428e57be323c3c5545db8ad12667099bc6005
|Author: Cyrill Gorcunov <gorcunov@openvz.org>
|Date: Sun Jun 7 16:48:40 2009 +0400
|
| x86, apic: Fix dummy apic read operation together with broken MP handling
|
If we have apic disabled we don't even switch to APIC mode and do not
calling for connect_bsp_APIC. Though on SMP compiled kernel the
native_machine_shutdown does try to write the apic register anyway.
Fix it with explicit check if we really should touch apic registers.
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090617181322.GG10822@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
LKML-Reference: <1244895686-2348-1-git-send-email-weiyi.huang@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Expand Intel NMI perfctr1 workaround to include a Core2 processor stepping
(cpuid family-6, model-f, stepping-4). Resolves a situation where the NMI
would not enable on these processors.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: prarit@redhat.com
Cc: suresh.b.siddha@intel.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
This compiler warning:
arch/x86/kernel/apic/io_apic.c: In function ‘ioapic_write_entry’:
arch/x86/kernel/apic/io_apic.c:466: warning: ‘eu’ is used uninitialized in this function
arch/x86/kernel/apic/io_apic.c:465: note: ‘eu’ was declared here
Is bogus as 'eu' is always initialized. But annotate it away by
initializing the variable, to make it easier for people to notice
real warnings. A compiler that sees through this logic will
optimize away the initialization.
Signed-off-by: Figo.zhang <figo1802@gmail.com>
LKML-Reference: <1245248720.3312.27.camel@myhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
If APIC was disabled (for some reason) and as result
it's not even mapped we should not try to enable thermal
interrupts at all.
Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Tested-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <20090615182633.GA7606@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Vegard Nossum reported:
> I get an MCE-related crash like this in latest linus tree:
>
> [ 0.115341] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
> [ 0.116396] CPU: L2 Cache: 512K (64 bytes/line)
> [ 0.120570] mce: CPU supports 0 MCE banks
> [ 0.124870] BUG: unable to handle kernel NULL pointer dereference at 00000000 00000010
> [ 0.128001] IP: [<ffffffff813b98ad>] mcheck_init+0x278/0x320
> [ 0.128001] PGD 0
> [ 0.128001] Thread overran stack, or stack corrupted
> [ 0.128001] Oops: 0002 [#1] PREEMPT SMP
> [ 0.128001] last sysfs file:
> [ 0.128001] CPU 0
> [ 0.128001] Modules linked in:
> [ 0.128001] Pid: 0, comm: swapper Not tainted 2.6.30 #426
> [ 0.128001] RIP: 0010:[<ffffffff813b98ad>] [<ffffffff813b98ad>] mcheck_init+0x278/0x320
> [ 0.128001] RSP: 0018:ffffffff81595e38 EFLAGS: 00000246
> [ 0.128001] RAX: 0000000000000010 RBX: ffffffff8158f900 RCX: 0000000000000000
> [ 0.128001] RDX: 0000000000000000 RSI: 00000000000000ff RDI: 0000000000000010
> [ 0.128001] RBP: ffffffff81595e68 R08: 0000000000000001 R09: 0000000000000000
> [ 0.128001] R10: 0000000000000010 R11: 0000000000000000 R12: 0000000000000000
> [ 0.128001] R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
> [ 0.128001] FS: 0000000000000000(0000) GS:ffff880002288000(0000) knlGS:00000
> 00000000000
> [ 0.128001] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 0.128001] CR2: 0000000000000010 CR3: 0000000001001000 CR4: 00000000000006b0
> [ 0.128001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.128001] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
> [ 0.128001] Process swapper (pid: 0, threadinfo ffffffff81594000, task ffffff
> ff8152a4a0)
> [ 0.128001] Stack:
> [ 0.128001] 0000000081595e68 5aa50ed3b4ddbe6e ffffffff8158f900 ffffffff8158f
> 914
> [ 0.128001] ffffffff8158f948 0000000000000000 ffffffff81595eb8 ffffffff813b8
> 69c
> [ 0.128001] 5aa50ed3b4ddbe6e 00000001078bfbfd 0000062300000800 5aa50ed3b4ddb
> e6e
> [ 0.128001] Call Trace:
> [ 0.128001] [<ffffffff813b869c>] identify_cpu+0x331/0x392
> [ 0.128001] [<ffffffff815a1445>] identify_boot_cpu+0x23/0x6e
> [ 0.128001] [<ffffffff815a14ac>] check_bugs+0x1c/0x60
> [ 0.128001] [<ffffffff8159c075>] start_kernel+0x403/0x46e
> [ 0.128001] [<ffffffff8159b2ac>] x86_64_start_reservations+0xac/0xd5
> [ 0.128001] [<ffffffff8159b3ea>] x86_64_start_kernel+0x115/0x14b
> [ 0.128001] [<ffffffff8159b140>] ? early_idt_handler+0x0/0x71
This happens on QEMU which reports MCA capability, but no banks.
Without this patch there is a buffer overrun and boot ops because
the code would try to initialize the 0 element of a zero length
kmalloc() buffer.
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <20090615125200.GD31969@one.firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Merge reason: pull in latest to fix a bug in it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
This patch causes all the EFI_RESERVED_TYPE memory reservations to be recorded
in the e820 table as type E820_RESERVED.
(This patch replaces one called 'x86: vendor reserved memory type'.
This version has been discussed a bit with Peter and Yinghai but not given
a final opinion.)
Without this patch EFI_RESERVED_TYPE memory reservations may be
marked usable in the e820 table. There may be a collision between
kernel use and some reserver's use of this memory.
(An example use of this functionality is the UV system, which
will access extremely large areas of memory with a memory engine
that allows a user to address beyond the processor's range. Such
areas are reserved in the EFI table by the BIOS.
Some loaders have a restricted number of entries possible in the e820 table,
hence the need to record the reservations in the unrestricted EFI table.)
The call to do_add_efi_memmap() is only made if "add_efi_memmap" is specified
on the kernel command line.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
|
iomem_resource is by default initialized to -1, which means 64 bits of
physical address space if 64-bit resources are enabled. However, x86
CPUs cannot address 64 bits of physical address space. Thus, we want
to cap the physical address space to what the union of all CPU can
actually address.
Without this patch, we may end up assigning inaccessible values to
uninitialized 64-bit PCI memory resources.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Martin Mares <mj@ucw.cz>
Cc: stable@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent
|
|
Now that enable_iommus() will call iommu_disable() for each iommu,
the call to disable_iommus() during resume is redundant. Also, the order
for an invalidation is to invalidate device table entries first, then
domain translations.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers: Logic to move non pinned timers
timers: /proc/sys sysctl hook to enable timer migration
timers: Identifying the existing pinned timers
timers: Framework for identifying pinned timers
timers: allow deferrable timers for intervals tv2-tv5 to be deferred
Fix up conflicts in kernel/sched.c and kernel/timer.c manually
|
|
These registers may contain values from previous kernels. So reset them
to known values before enable the event buffer again.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
The IOMMU spec states that IOMMU behavior may be undefined when the
IOMMU registers are rewritten while command or event buffer is enabled.
Disable them in IOMMU disable path.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
When kexec'ing to a new kernel (for example, when crashing and launching
a kdump session), the AMD IOMMU may have cached translations. The kexec'd
kernel, during initialization, will invalidate the IOMMU device table
entries, but not the domain translations. These stale entries can cause
a device's DMA to fail, makes it rough to write a dump to disk when the
disk controller can't DMA ;-)
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
If the IOMMUs are still enabled when the kexec kernel boots access to
the disk is not possible. This is bad for tools like kdump or anything
else which wants to use PCI devices.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
When the IOMMU stays enabled the BIOS may not be able to finish the
machine shutdown properly. So disable the hardware on shutdown.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
timer interrupts are excluded from being disabled during suspend. The
clock events code manages the disabling of clock events on its own
because the timer interrupt needs to be functional before the resume
code reenables the device interrupts.
The hpet per cpu timers request their interrupt without setting the
IRQF_TIMER flag so suspend_device_irqs() disables them as well which
results in a fatal resume failure on the boot CPU.
Adding IRQF_TIMER to the interupt flags when requesting the hpet per
cpu timer interrupts solves the problem.
Reported-by: Benjamin S. <sbenni@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Benjamin S. <sbenni@gmx.de>
Cc: stable@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (80 commits)
x86, mce: Add boot options for corrected errors
x86, mce: Fix mce printing
x86, mce: fix for mce counters
x86, mce: support action-optional machine checks
x86, mce: define MCE_VECTOR
x86, mce: rename mce_notify_user to mce_notify_irq
x86: fix panic with interrupts off (needed for MCE)
x86, mce: export MCE severities coverage via debugfs
x86, mce: implement new status bits
x86, mce: print header/footer only once for multiple MCEs
x86, mce: default to panic timeout for machine checks
x86, mce: improve mce_get_rip
x86, mce: make non Monarch panic message "Fatal machine check" too
x86, mce: switch x86 machine check handler to Monarch election.
x86, mce: implement panic synchronization
x86, mce: implement bootstrapping for machine check wakeups
x86, mce: check early in exception handler if panic is needed
x86, mce: add table driven machine check grading
x86, mce: remove TSC print heuristic
x86, mce: log corrected errors when panicing
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM: Add empty suspend/resume device irq functions
PM/Hibernate: Move NVS routines into a seperate file (v2).
PM/Hibernate: Rename disk.c to hibernate.c
PM: Separate suspend to RAM functionality from core
Driver Core: Rework platform suspend/resume, print warning
PM: Remove device_type suspend()/resume()
PM/Hibernate: Move memory shrinking to snapshot.c (rev. 2)
PM/Suspend: Do not shrink memory before suspend
PM: Remove bus_type suspend_late()/resume_early() V2
PM core: rename suspend and resume functions
PM: Rename device_power_down/up()
PM: Remove unused asm/suspend.h
x86: unify power/cpu_(32|64).c
x86: unify power/cpu_(32|64) copyright notes
x86: unify power/cpu_(32|64) regarding restoring processor state
x86: unify power/cpu_(32|64) regarding saving processor state
x86: unify power/cpu_(32|64) global variables
x86: unify power/cpu_(32|64) headers
PM: Warn if interrupts are enabled during suspend-resume of sysdevs
PM/ACPI/x86: Fix sparse warning in arch/x86/kernel/acpi/sleep.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter: Start documenting HAVE_PERF_COUNTERS requirements
perf_counter: Add forward/backward attribute ABI compatibility
perf record: Explicity program a default counter
perf_counter: Remove PERF_TYPE_RAW special casing
perf_counter: PERF_TYPE_HW_CACHE is a hardware counter too
powerpc, perf_counter: Fix performance counter event types
perf_counter/x86: Add a quirk for Atom processors
perf_counter tools: Remove one L1-data alias
|
|
This patch (as1241) renames a bunch of functions in the PM core.
Rather than go through a boring list of name changes, suffice it to
say that in the end we have a bunch of pairs of functions:
device_resume_noirq dpm_resume_noirq
device_resume dpm_resume
device_complete dpm_complete
device_suspend_noirq dpm_suspend_noirq
device_suspend dpm_suspend
device_prepare dpm_prepare
in which device_X does the X operation on a single device and dpm_X
invokes device_X for all devices in the dpm_list.
In addition, the old dpm_power_up and device_resume_noirq have been
combined into a single function (dpm_resume_noirq).
Lastly, dpm_suspend_start and dpm_resume_end are the renamed versions
of the former top-level device_suspend and device_resume routines.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
Rename the functions performing "_noirq" dev_pm_ops
operations from device_power_down() and device_power_up()
to device_suspend_noirq() and device_resume_noirq().
The new function names are chosen to show that the functions
are responsible for calling the _noirq() versions to finalize
the suspend/resume operation. The current function names do
not perform power down/up anymore so the names may be misleading.
Global function renames:
- device_power_down() -> device_suspend_noirq()
- device_power_up() -> device_resume_noirq()
Static function renames:
- suspend_device_noirq() -> __device_suspend_noirq()
- resume_device_noirq() -> __device_resume_noirq()
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
One of the numbers in arch/x86/kernel/acpi/sleep.c is long, but it is
not annotated appropriately, so sparese warns about it. Fix that.
[rjw: added the changelog.]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'topic/slab/earlyboot-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slab: setup cpu caches later on when interrupts are enabled
slab,slub: don't enable interrupts during early boot
slab: fix gfp flag in setup_cpu_cache()
x86: make zap_low_mapping could be used early
irq: slab alloc for default irq_affinity
memcg: fix page_cgroup fatal error in FLATMEM
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest: (31 commits)
lguest: add support for indirect ring entries
lguest: suppress notifications in example Launcher
lguest: try to batch interrupts on network receive
lguest: avoid sending interrupts to Guest when no activity occurs.
lguest: implement deferred interrupts in example Launcher
lguest: remove obsolete LHREQ_BREAK call
lguest: have example Launcher service all devices in separate threads
lguest: use eventfds for device notification
eventfd: export eventfd_signal and eventfd_fget for lguest
lguest: allow any process to send interrupts
lguest: PAE fixes
lguest: PAE support
lguest: Add support for kvm_hypercall4()
lguest: replace hypercall name LHCALL_SET_PMD with LHCALL_SET_PGD
lguest: use native_set_* macros, which properly handle 64-bit entries when PAE is activated
lguest: map switcher with executable page table entries
lguest: fix writev returning short on console output
lguest: clean up length-used value in example launcher
lguest: Segment selectors are 16-bit long. Fix lg_cpu.ss1 definition.
lguest: beyond ARRAY_SIZE of cpu->arch.gdt
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param:
module: cleanup FIXME comments about trimming exception table entries.
module: trim exception table on init free.
module: merge module_alloc() finally
uml module: fix uml build process due to this merge
x86 module: merge the rest functions with macros
x86 module: merge the same functions in module_32.c and module_64.c
uvesafb: improve parameter handling.
module_param: allow 'bool' module_params to be bool, not just int.
module_param: add __same_type convenience wrapper for __builtin_types_compatible_p
module_param: split perm field into flags and perm
module_param: invbool should take a 'bool', not an 'int'
cyber2000fb.c: use proper method for stopping unload if CONFIG_ARCH_SHARK
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Provide _sdata in the vmlinux.lds.S file
x86: handle initrd that extends into unusable memory
|
|
The downside of the last patch which made restore_flags and irq_enable
check interrupts is that they are now too big to be patched directly
into the callsites, so the C versions are always used.
But the C versions go via PV_CALLEE_SAVE_REGS_THUNK which saves all
the registers. In fact, we don't need any registers in the fast path,
so we can do better than this if we actually code them in assembler.
The results are in the noise, but since it's about the same amount of
code, it's worth applying.
1GB Guest->Host: input(suppressed),output(suppressed)
Before:
Seconds: 0:16.53
Packets: 377268,753673
Interrupts: 22461,24297
Notifications: 1(5245),21303(732370)
Net IRQs triggered: 377023(245),42578(711095)
After:
Seconds: 0:16.48
Packets: 377289,753673
Interrupts: 22281,24465
Notifications: 1(5245),21296(732377)
Net IRQs triggered: 377060(229),42564(711109)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Everyone cut and paste this comment from my original one. We now do
it generically, so cut the comments.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Amerigo Wang <amwang@redhat.com>
|
|
As Christoph Hellwig suggested, module_alloc() actually can be
unified for i386 and x86_64 (of course, also UML).
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: 'Ingo Molnar' <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Merge the rest functions together, with proper preprocessing directives.
Finally remove module_{32|64}.c.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Merge the same functions both in module_32.c and module_64.c into
module.c.
This is the first step to merge both of them finally.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
The fixed-function performance counters do not work on current Atom
processors. Use the general-purpose ones instead.
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090612080855.GA2286@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Only one cpu is there, just call __flush_tlb for it. Fixes the following boot
warning on x86:
[ 0.000000] Memory: 885032k/915540k available (5993k kernel code, 29844k reserved, 3842k data, 428k init, 0k highmem)
[ 0.000000] virtual kernel memory layout:
[ 0.000000] fixmap : 0xffe17000 - 0xfffff000 (1952 kB)
[ 0.000000] vmalloc : 0xf8615000 - 0xffe15000 ( 120 MB)
[ 0.000000] lowmem : 0xc0000000 - 0xf7e15000 ( 894 MB)
[ 0.000000] .init : 0xc19a5000 - 0xc1a10000 ( 428 kB)
[ 0.000000] .data : 0xc15da4bb - 0xc199af6c (3842 kB)
[ 0.000000] .text : 0xc1000000 - 0xc15da4bb (5993 kB)
[ 0.000000] Checking if this processor honours the WP bit even in supervisor mode...Ok.
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: at kernel/smp.c:369 smp_call_function_many+0x50/0x1b0()
[ 0.000000] Hardware name: System Product Name
[ 0.000000] Modules linked in:
[ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip #52504
[ 0.000000] Call Trace:
[ 0.000000] [<c104aa16>] warn_slowpath_common+0x65/0x95
[ 0.000000] [<c104aa58>] warn_slowpath_null+0x12/0x15
[ 0.000000] [<c1073bbe>] smp_call_function_many+0x50/0x1b0
[ 0.000000] [<c1037615>] ? do_flush_tlb_all+0x0/0x41
[ 0.000000] [<c1037615>] ? do_flush_tlb_all+0x0/0x41
[ 0.000000] [<c1073d4f>] smp_call_function+0x31/0x58
[ 0.000000] [<c1037615>] ? do_flush_tlb_all+0x0/0x41
[ 0.000000] [<c104f635>] on_each_cpu+0x26/0x65
[ 0.000000] [<c10374b5>] flush_tlb_all+0x19/0x1b
[ 0.000000] [<c1032ab3>] zap_low_mappings+0x4d/0x56
[ 0.000000] [<c15d64b5>] ? printk+0x14/0x17
[ 0.000000] [<c19b42a8>] mem_init+0x23d/0x245
[ 0.000000] [<c19a56a1>] start_kernel+0x17a/0x2d5
[ 0.000000] [<c19a5347>] ? unknown_bootoption+0x0/0x19a
[ 0.000000] [<c19a5039>] __init_begin+0x39/0x41
[ 0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
|
|
_sdata is a common symbol defined by many architectures and made
available to the kernel via asm-generic/sections.h. Kmemleak uses this
symbol when scanning the data sections.
[ Impact: add new global symbol ]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
LKML-Reference: <20090511122105.26556.96593.stgit@pc1117.cambridge.arm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
So we make sure MAXSMP gets a cleared cpumask
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
On a system where system memory (according e820) is not covered by
mtrr, mtrr_trim_memory converts a portion of memory to reserved, but
bootloader has already put the initrd in that range.
Thus, we need to have 64bit to use relocate_initrd too.
[ Impact: fix using initrd when mtrr_trim_memory happen ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: stable@kernel.org
|
|
Conflicts:
arch/x86/kernel/cpu/mcheck/mce_64.c
arch/x86/kernel/irq.c
Merge reason: Resolve the conflicts above.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (574 commits)
perf_counter: Turn off by default
perf_counter: Add counter->id to the throttle event
perf_counter: Better align code
perf_counter: Rename L2 to LL cache
perf_counter: Standardize event names
perf_counter: Rename enums
perf_counter tools: Clean up u64 usage
perf_counter: Rename perf_counter_limit sysctl
perf_counter: More paranoia settings
perf_counter: powerpc: Implement generalized cache events for POWER processors
perf_counters: powerpc: Add support for POWER7 processors
perf_counter: Accurate period data
perf_counter: Introduce struct for sample data
perf_counter tools: Normalize data using per sample period data
perf_counter: Annotate exit ctx recursion
perf_counter tools: Propagate signals properly
perf_counter tools: Small frequency related fixes
perf_counter: More aggressive frequency adjustment
perf_counter/x86: Fix the model number of Intel Core2 processors
perf_counter, x86: Correct some event and umask values for Intel processors
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'topic/slab/earlyboot' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
vgacon: use slab allocator instead of the bootmem allocator
irq: use kcalloc() instead of the bootmem allocator
sched: use slab in cpupri_init()
sched: use alloc_cpumask_var() instead of alloc_bootmem_cpumask_var()
memcg: don't use bootmem allocator in setup code
irq/cpumask: make memoryless node zero happy
x86: remove some alloc_bootmem_cpumask_var calling
vt: use kzalloc() instead of the bootmem allocator
sched: use kzalloc() instead of the bootmem allocator
init: introduce mm_init()
vmalloc: use kzalloc() instead of alloc_bootmem()
slab: setup allocators earlier in the boot sequence
bootmem: fix slab fallback on numa
bootmem: use slab if bootmem is no longer available
|
|
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits)
KVM: Prevent overflow in largepages calculation
KVM: Disable large pages on misaligned memory slots
KVM: Add VT-x machine check support
KVM: VMX: Rename rmode.active to rmode.vm86_active
KVM: Move "exit due to NMI" handling into vmx_complete_interrupts()
KVM: Disable CR8 intercept if tpr patching is active
KVM: Do not migrate pending software interrupts.
KVM: inject NMI after IRET from a previous NMI, not before.
KVM: Always request IRQ/NMI window if an interrupt is pending
KVM: Do not re-execute INTn instruction.
KVM: skip_emulated_instruction() decode instruction if size is not known
KVM: Remove irq_pending bitmap
KVM: Do not allow interrupt injection from userspace if there is a pending event.
KVM: Unprotect a page if #PF happens during NMI injection.
KVM: s390: Verify memory in kvm run
KVM: s390: Sanity check on validity intercept
KVM: s390: Unlink vcpu on destroy - v2
KVM: s390: optimize float int lock: spin_lock_bh --> spin_lock
KVM: s390: use hrtimer for clock wakeup from idle - v2
KVM: s390: Fix memory slot versus run - v3
...
|
|
Don't hardcode to node zero for early boot IRQ setup memory allocations.
[ penberg@cs.helsinki.fi: minor cleanups ]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
|
|
Now that we set up the slab allocator earlier, we can get rid of some
alloc_bootmem_cpumask_var() calls in boot code.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
|
|
Conflicts:
arch/x86/kernel/irqinit.c
arch/x86/kernel/irqinit_64.c
arch/x86/kernel/traps.c
arch/x86/mm/fault.c
include/linux/sched.h
kernel/exit.c
|
|
The top (fastest) and last level (biggest) caches are the most
interesting ones, performance wise.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
[ Fixed the Nehalem LL table to LLC Reference/Miss events ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|