Age | Commit message (Collapse) | Author |
|
Previously, one would have to specifically choose CONFIG_OLPC and
CONFIG_PCI_GOOLPC in order to enable PCI_OLPC. That doesn't really work
for distro kernels, so this patch allows one to choose CONFIG_OLPC and
CONFIG_PCI_GOANY in order to build in OLPC support in a generic kernel (as
requested by Robert Millan).
This also moves GOOLPC before GOANY in the menuconfig list.
Finally, make pci_access_init return early if we detect OLPC hardware.
There's no need to continue probing stuff, and pci_pcbios_init
specifically trashes our settings (we didn't run into that before because
PCI_GOANY wasn't supported).
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Here is an attempt to translate the prompt and help text into something
which is legible and, as a bonus, correct.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Jürgen Mell reported an FPU state corruption bug under CONFIG_PREEMPT,
and bisected it to commit v2.6.19-1363-gacc2076, "i386: add sleazy FPU
optimization".
Add tsk_used_math() checks to prevent calling math_state_restore()
which can sleep in the case of !tsk_used_math(). This prevents
making a blocking call in __switch_to().
Apparently "fpu_counter > 5" check is not enough, as in some signal handling
and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
It's a side effect though. This is the failing scenario:
process 'A' in save_i387_ia32() just after clear_used_math()
Got an interrupt and pre-empted out.
At the next context switch to process 'A' again, kernel tries to restore
the math state proactively and sees a fpu_counter > 0 and !tsk_used_math()
This results in init_fpu() during the __switch_to()'s math_state_restore()
And resulting in fpu corruption which will be saved/restored
(save_i387_fxsave and restore_i387_fxsave) during the remaining
part of the signal handling after the context switch.
Bisected-by: Jürgen Mell <j.mell@t-online.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Jürgen Mell <j.mell@t-online.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
|
|
iommu/gart support misses suspend/resume code, which can do bad stuff,
including memory corruption on resume. Prevent system suspend in case we
would be unable to resume.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Tested-by: Patrick <ragamuffin@datacomm.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Fix this:
WARNING: vmlinux.o(.text+0x114bb): Section mismatch in reference from
the function nopat() to the function .cpuinit.text:pat_disable()
The function nopat() references
the function __cpuinit pat_disable().
This is often because nopat lacks a __cpuinit
annotation or the annotation of pat_disable is wrong.
Reported-by: "Fabio Comolli" <fabio.comolli@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Clarify the usage of mtrr_lookup() in PAT code, and to make PAT code
resilient to mtrr lookup problems.
Specifically, pat_x_mtrr_type() is restructured to highlight, under what
conditions we look for mtrr hint. pat_x_mtrr_type() uses a default type
when there are any errors in mtrr lookup (still maintaining the pat
consistency). And, reserve_memtype() highlights its usage ot mtrr_lookup
for request type of '-1' and also defaults in a sane way on any mtrr
lookup failure.
pat.c looks at mtrr type of a range to get a hint on what mapping type
to request when user/API: (1) hasn't specified any type (/dev/mem
mapping) and we do not want to take performance hit by always mapping
UC_MINUS. This will be the case for /dev/mem mappings used to map BIOS
area or ACPI region which are WB'able. In this case, as long as MTRR is
not WB, PAT will request UC_MINUS for such mappings.
(2) user/API requests WB mapping while in reality MTRR may have UC or
WC. In this case, PAT can map as WB (without checking MTRR) and still
effective type will be UC or WC. But, a subsequent request to map same
region as UC or WC may fail, as the region will get trackked as WB in
PAT list. Looking at MTRR hint helps us to track based on effective type
rather than what user requested. Again, here mtrr_lookup is only used as
hint and we fallback to WB mapping (as requested by user) as default.
In both cases, after using the mtrr hint, we still go through the
memtype list to make sure there are no inconsistencies among multiple
users.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Rufus & Azrael <rufus-azrael@numericable.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Changed the call to find_e820_area_size to pass u64 instead of unsigned long.
Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
OGAWA Hirofumi and Fede have reported rare pmd_ERROR messages:
mm/memory.c:127: bad pmd ffff810000207xxx(9090909090909090).
Initialization's cleanup_highmap was leaving alignment filler
behind in the pmd for MODULES_VADDR: when vmalloc's guard page
would occupy a new page table, it's not allocated, and then
module unload's vfree hits the bad 9090 pmd entry left over.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Mika Kukkonen noticed that the nesting check in early_iounmap() is not
actually done.
Reported-by: Mika Kukkonen <mikukkon@srv1-m700-lanp.koti>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: torvalds@linux-foundation.org
Cc: arjan@linux.intel.com
Cc: mikukkon@iki.fi
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Fix the math emulation that got broken with the recent lazy allocation of FPU
area. init_fpu() need to be added for the math-emulation path aswell
for the FPU area allocation.
math emulation enabled kernel booted fine with this, in the presence
of "no387 nofxsr" boot param.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: hpa@zytor.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The RT team has been searching for a nasty latency. This latency shows
up out of the blue and has been seen to be as big as 5ms!
Using ftrace I found the cause of the latency.
pcscd-2995 3dNh1 52360300us : irq_exit (smp_apic_timer_interrupt)
pcscd-2995 3dN.2 52360301us : idle_cpu (irq_exit)
pcscd-2995 3dN.2 52360301us : rcu_irq_exit (irq_exit)
pcscd-2995 3dN.1 52360771us : smp_apic_timer_interrupt (apic_timer_interrupt
)
pcscd-2995 3dN.1 52360771us : exit_idle (smp_apic_timer_interrupt)
Here's an example of a 400 us latency. pcscd took a timer interrupt and
returned with "need resched" enabled, but did not reschedule until after
the next interrupt came in at 52360771us 400us later!
At first I thought we somehow missed a preemption check in entry.S. But
I also noticed that this always seemed to happen during a __delay call.
pcscd-2995 3dN.2 52360836us : rcu_irq_exit (irq_exit)
pcscd-2995 3.N.. 52361265us : preempt_schedule (__delay)
Looking at the x86 delay, I found my problem.
In git commit 35d5d08a085c56f153458c3f5d8ce24123617faf, Andrew Morton
placed preempt_disable around the entire delay due to TSC's not working
nicely on SMP. Unfortunately for those that care about latencies this
is devastating! Especially when we have callers to mdelay(8).
Here I enable preemption during the loop and account for anytime the task
migrates to a new CPU. The delay asked for may be extended a bit by
the migration, but delay only guarantees that it will delay for that minimum
time. Delaying longer should not be an issue.
[
Thanks to Thomas Gleixner for spotting that cpu wasn't updated,
and to place the rep_nop between preempt_enabled/disable.
]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: akpm@osdl.org
Cc: Clark Williams <clark.williams@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <andi-suse@firstfloor.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Priit Laes reported the following warning:
Call Trace:
[<ffffffff8022f1e1>] warn_on_slowpath+0x51/0x63
[<ffffffff80282e48>] sys_ioctl+0x2d/0x5d
[<ffffffff805185ff>] _spin_lock+0xe/0x24
[<ffffffff80227459>] task_rq_lock+0x3d/0x73
[<ffffffff805133c3>] set_cpu_sibling_map+0x336/0x350
[<ffffffff8021c1b8>] read_apic_id+0x30/0x62
[<ffffffff806d921d>] verify_local_APIC+0x90/0x138
[<ffffffff806d84b5>] native_smp_prepare_cpus+0x1f9/0x305
[<ffffffff806ce7b1>] kernel_init+0x59/0x2d9
[<ffffffff80518a26>] _spin_unlock_irq+0x11/0x2b
[<ffffffff8020bf48>] child_rip+0xa/0x12
[<ffffffff806ce758>] kernel_init+0x0/0x2d9
[<ffffffff8020bf3e>] child_rip+0x0/0x12
fix this by generally disabling preemption in native_smp_prepare_cpus().
Reported-and-bisected-by: Priit Laes <plaes@plaes.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
for http://bugzilla.kernel.org/show_bug.cgi?id=10613
BIOS bug, APIC version is 0 for CPU#0! fixing up to 0x10. (tell your hw vendor)
v2: fix 64 bit compilation
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Gabriel C <nix.or.die@googlemail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Define recently added XEN_ELFNOTEs, and use them appropriately.
Most significantly, this enables domain checkpointing (xm save -c).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
On resume, the vcpu timer modes will not be restored. The timer
infrastructure doesn't do this for us, since it assumes the cpus
are offline. We can just poke the other vcpus into the right mode
directly though.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
If we're using vcpu_info mapping, then make sure its restored on all
processors before relasing them from stop_machine.
The only complication is that if this fails, we can't continue because
we've already made assumptions that the mapping is available (baked in
calls to the _direct versions of the functions, for example).
Fortunately this can only happen with a 32-bit hypervisor, which may
possibly run out of mapping space. On a 64-bit hypervisor, this is a
non-issue.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
When operating on an unpinned pagetable (ie, one under construction or
destruction), it isn't necessary to use a hypercall to update a
pud/pmd entry. Jan Beulich observed that a similar optimisation
avoided many thousands of hypercalls while doing a kernel build.
One tricky part is that early in the kernel boot there's no page
structure, so we can't check to see if the page is pinned. In that
case, we just always use the hypercall.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
-tip testing found the following xen-console symbols trouble:
ERROR: "get_phys_to_machine" [drivers/video/xen-fbfront.ko] undefined!
ERROR: "get_phys_to_machine" [drivers/net/xen-netfront.ko] undefined!
ERROR: "get_phys_to_machine" [drivers/input/xen-kbdfront.ko] undefined!
with:
http://redhat.com/~mingo/misc/config-Mon_Jun__2_12_25_13_CEST_2008.bad
|
|
iommu/gart support misses suspend/resume code, which can do bad stuff,
including memory corruption on resume. Prevent system suspend in case we
would be unable to resume.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Tested-by: Patrick <ragamuffin@datacomm.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
On Wed, 2008-05-28 at 04:47 +0200, Andi Kleen wrote:
> > So... why not just remove the setting of __GFP_NORETRY? Why is it
> > wrong to oom-kill things in this case?
>
> When the 16MB zone overflows (which can be common in some workloads)
> calling the OOM killer is pretty useless because it has barely any
> real user data [only exception would be the "only 16MB" case Alan
> mentioned]. Killing random processes in this case is bad.
>
> I think for 16MB __GFP_NORETRY is ok because there should be
> nothing freeable in there so looping is useless. Only exception would be the
> "only 16MB total" case again but I'm not sure 2.6 supports that at all
> on x86.
>
> On the other hand d_a_c() does more allocations than just 16MB, especially
> on 64bit and the other zones need different strategies.
Okay, so how about this then ?
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
This BIOS claims the VIA 8237 south bridge to be compatible with VIA 586,
which it is not.
Without this patch, I get the following warning while booting,
among others,
| PCI: Using IRQ router VIA [1106/3227] at 0000:00:11.0
| ------------[ cut here ]------------
| WARNING: at arch/x86/pci/irq.c:265 pirq_via586_get+0x4a/0x60()
| Modules linked in:
| Pid: 1, comm: swapper Not tainted 2.6.26-rc4-00015-g1ec7d99 #1
| [<c0119fd4>] warn_on_slowpath+0x54/0x70
| [<c02246e0>] ? vt_console_print+0x210/0x2b0
| [<c02244d0>] ? vt_console_print+0x0/0x2b0
| [<c011a413>] ? __call_console_drivers+0x43/0x60
| [<c011a482>] ? _call_console_drivers+0x52/0x80
| [<c011aa89>] ? release_console_sem+0x1c9/0x200
| [<c0291d21>] ? raw_pci_read+0x41/0x70
| [<c0291e8f>] ? pci_read+0x2f/0x40
| [<c029151a>] pirq_via586_get+0x4a/0x60
| [<c02914d0>] ? pirq_via586_get+0x0/0x60
| [<c029178d>] pcibios_lookup_irq+0x15d/0x430
| [<c03b895a>] pcibios_irq_init+0x17a/0x3e0
| [<c03a66f0>] ? kernel_init+0x0/0x250
| [<c03a6763>] kernel_init+0x73/0x250
| [<c03b87e0>] ? pcibios_irq_init+0x0/0x3e0
| [<c0114d00>] ? schedule_tail+0x10/0x40
| [<c0102dee>] ? ret_from_fork+0x6/0x1c
| [<c03a66f0>] ? kernel_init+0x0/0x250
| [<c03a66f0>] ? kernel_init+0x0/0x250
| [<c010324b>] kernel_thread_helper+0x7/0x1c
| =======================
| ---[ end trace 4eaa2a86a8e2da22 ]---
and IRQ trouble later,
| irq 10: nobody cared (try booting with the "irqpoll" option)
Now that's an VIA 8237 chip, so pirq_via586_get shouldn't be called
at all; adding this workaround to via_router_probe() fixes the
problem for me.
Amazingly I have a 2.6.23.8 kernel that somehow works fine ... I'll
never understand why.
Signed-off-by: Bertram Felgenhauer <int-e@gmx.de>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Before:
root@ubuntu:~# cat /proc/interrupts
CPU0
1: 1672 lguest-<NULL> virtio0
2: 1 lguest-<NULL> virtio1
...
After:
root@ubuntu:~# cat /proc/interrupts
CPU0
1: 2889 lguest-level virtio0
2: 9 lguest-level virtio1
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
-tip tree auto-testing found the following early bootup hang:
-------------->
get_memcfg_from_srat: assigning address to rsdp
RSD PTR v0 [Nvidia]
BUG: Int 14: CR2 ffd00040
EDI 8092fbfe ESI ffd00040 EBP 80b0aee8 ESP 80b0aed0
EBX 000f76f0 EDX 0000000e ECX 00000003 EAX ffd00040
err 00000000 EIP 802c055a CS 00000060 flg 00010006
Stack: ffd00040 80bc78d0 80b0af6c 80b1dbfe 8093d8ba 00000008 80b42810 80b4ddb4
80b42842 00000000 80b0af1c 801079c8 808e724e 00000000 80b42871 802c0531
00000100 00000000 0003fff0 80b0af40 80129999 00040100 00040100 00000000
Pid: 0, comm: swapper Not tainted 2.6.26-rc4-sched-devel.git #570
[<802c055a>] ? strncmp+0x11/0x25
[<80b1dbfe>] ? get_memcfg_from_srat+0xb4/0x568
[<801079c8>] ? mcount_call+0x5/0x9
[<802c0531>] ? strcmp+0xa/0x22
[<80129999>] ? printk+0x38/0x3a
[<80129999>] ? printk+0x38/0x3a
[<8011b122>] ? memory_present+0x66/0x6f
[<80b216b4>] ? setup_memory+0x13/0x40c
[<80b16b47>] ? propagate_e820_map+0x80/0x97
[<80b1622a>] ? setup_arch+0x248/0x477
[<80129999>] ? printk+0x38/0x3a
[<80b11759>] ? start_kernel+0x6e/0x2eb
[<80b110fc>] ? i386_start_kernel+0xeb/0xf2
=======================
<------
with this config:
http://redhat.com/~mingo/misc/config-Wed_May_28_01_33_33_CEST_2008.bad
The thing is, the crash makes little sense at first sight. We crash on a
benign-looking printk. The code around it got changed in -tip but
checking those topic branches individually did not reproduce the bug.
Bisection led to this commit:
| d5edbc1f75420935b1ec7e65df10c8f81cea82de is first bad commit
| commit d5edbc1f75420935b1ec7e65df10c8f81cea82de
| Author: Jeremy Fitzhardinge <jeremy@goop.org>
| Date: Mon May 26 23:31:22 2008 +0100
|
| xen: add p2m mfn_list_list
Which is somewhat surprising, as on native hardware Xen client side
should have little to no side-effects.
After some head scratching, it turns out the following happened:
randconfig enabled the following Xen options:
CONFIG_XEN=y
CONFIG_XEN_MAX_DOMAIN_MEMORY=8
# CONFIG_XEN_BLKDEV_FRONTEND is not set
# CONFIG_XEN_NETDEV_FRONTEND is not set
CONFIG_HVC_XEN=y
# CONFIG_XEN_BALLOON is not set
which activated this piece of code in arch/x86/xen/mmu.c:
> @@ -69,6 +69,13 @@
> __attribute__((section(".data.page_aligned"))) =
> { [ 0 ... TOP_ENTRIES - 1] = &p2m_missing[0] };
>
> +/* Arrays of p2m arrays expressed in mfns used for save/restore */
> +static unsigned long p2m_top_mfn[TOP_ENTRIES]
> + __attribute__((section(".bss.page_aligned")));
> +
> +static unsigned long p2m_top_mfn_list[TOP_ENTRIES / P2M_ENTRIES_PER_PAGE]
> + __attribute__((section(".bss.page_aligned")));
The problem is, you must only put variables into .bss.page_aligned that
have a _size_ that is _exactly_ page aligned. In this case the size of
p2m_top_mfn_list is not page aligned:
80b8d000 b p2m_top_mfn
80b8f000 b p2m_top_mfn_list
80b8f008 b softirq_stack
80b97008 b hardirq_stack
80b9f008 b bm_pte
So all subsequent variables get unaligned which, depending on luck,
breaks the kernel in various funny ways. In this case what killed the
kernel first was the misaligned bootmap pte page, resulting in that
creative crash above.
Anyway, this was a fun bug to track down :-)
I think the moral is that .bss.page_aligned is a dangerous construct in
its current form, and the symptoms of breakage are very non-trivial, so
i think we need build-time checks to make sure all symbols in
.bss.page_aligned are truly page aligned.
The Xen fix below gets the kernel booting again.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Hook into the device model to make sure that timekeeping's resume handler
is called. This deals with our clocksource's non-monotonicity over the
save/restore. Explicitly call clock_has_changed() to make sure that
all the timers get retriggered properly.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This patch implements Xen save/restore and migration.
Saving is triggered via xenbus, which is polled in
drivers/xen/manage.c. When a suspend request comes in, the kernel
prepares itself for saving by:
1 - Freeze all processes. This is primarily to prevent any
partially-completed pagetable updates from confusing the suspend
process. If CONFIG_PREEMPT isn't defined, then this isn't necessary.
2 - Suspend xenbus and other devices
3 - Stop_machine, to make sure all the other vcpus are quiescent. The
Xen tools require the domain to run its save off vcpu0.
4 - Within the stop_machine state, it pins any unpinned pgds (under
construction or destruction), performs canonicalizes various other
pieces of state (mostly converting mfns to pfns), and finally
5 - Suspend the domain
Restore reverses the steps used to save the domain, ending when all
the frozen processes are thawed.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When saving a domain, the Xen tools need to remap all our mfns to
portable pfns. In order to remap our p2m table, it needs to know
where all its pages are, so maintain the references to the p2m table
for it to use.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Rename dummy_shared_info to xen_dummy_shared_info and make it
non-static, in anticipation of users outside of enlighten.c
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When using sparsemem and memory hotplug, the kernel's pseudo-physical
address space can be discontigious. Previously this was dealt with by
having the upper parts of the radix tree stubbed off. Unfortunately,
this is incompatible with save/restore, which requires a complete p2m
table.
The solution is to have a special distinguished all-invalid p2m leaf
page, which we can point all the hole areas at. This allows the tools
to see a complete p2m table, but it only costs a page for all memory
holes.
It also simplifies the code since it removes a few special cases.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add a config option to set the max size of a Xen domain. This is used
to scale the size of the physical-to-machine array; it ends up using
around 1 page/GByte, so there's no reason to be very restrictive.
For a 32-bit guest, the default value of 8GB is probably sufficient;
there's not much point in giving a 32-bit machine much more memory
than that.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
We now support the use of memory hotplug, so the physical to machine
page mapping structure must be dynamic. This is implemented as a
two-level radix tree structure, which allows us to efficiently
incrementally allocate memory for the p2m table as new pages are
added.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Make sure resched interrupts appear in /proc/interrupts in the proper
place.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
move arch/x86/xen/manage.c under drivers/xen/to share codes
with x86 and ia64.
ia64/xen also uses manage.c
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
For some perverse reason, if you call add_preferred_console() it prevents
setup_early_printk() from successfully enabling the boot console -
unless you make it a preferred console too...
Also, make xenboot console output distinct from normal console output,
since it gets repeated when the console handover happens, and the
duplicated output is confusing without disambiguation.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
|
|
Without console= arguments on the kernel command line, the first
console to register becomes enabled and the preferred console (the one
behind /dev/console). This is normally tty (assuming
CONFIG_VT_CONSOLE is enabled, which it commonly is).
This is okay as long tty is a useful console. But unless we have the
PV framebuffer, and it is enabled for this domain, tty0 in domU is
merely a dummy. In that case, we want the preferred console to be the
Xen console hvc0, and we want it without having to fiddle with the
kernel command line. Commit b8c2d3dfbc117dff26058fbac316b8acfc2cb5f7
did that for us.
Since we now have the PV framebuffer, we want to enable and prefer tty
again, but only when PVFB is enabled. But even then we still want to
enable the Xen console as well.
Problem: when tty registers, we can't yet know whether the PVFB is
enabled. By the time we can know (xenstore is up), the console setup
game is over.
Solution: enable console tty by default, but keep hvc as the preferred
console. Change the preferred console to tty when PVFB probes
successfully, unless we've been given console kernel parameters.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add pte_flags() to extract the flags from a pte. This is a special
case of pte_val() which is only guaranteed to return the pte's flags
correctly; the page number may be corrupted or missing.
The intent is to allow paravirt implementations to return pte flags
without having to do any translation of the page number (most notably,
Xen).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When enabling interrupts, we don't need to worry about preemption,
because we either enter with interrupts disabled - so no preemption -
or the caller is confused and is re-enabling interrupts on some
indeterminate processor.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The guest can legitimately change things like cr4.OSFXSR and
OSXMMEXCPT, so let it.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Use the new sched_op hypercall, mainly because xenner doesn't support
the old one.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Xen will trap and emulate clts, but its better to use a hypercall.
Also, xenner doesn't handle clts.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
init/Kconfig contains a list of configs that are searched
for if 'make *config' are used with no .config present.
Extend this list to look at the config identified by
ARCH_DEFCONFIG.
With this change we now try the defconfig targets last.
This fixes a regression reported
by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
x86: prevent PGE flush from interruption/preemption
x86: use explicit copy in vdso_gettimeofday()
namespacecheck: automated fixes
x86/xen: fix arbitrary_virt_to_machine()
x86: don't read maxlvt before checking if APIC is mapped
x86: disable TSC for sched_clock() when calibration failed
x86: distangle user disabled TSC from unstable
x86: fix setup of cyc2ns in tsc_64.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] clarify license of freq_table.c
[CPUFREQ] Remove documentation of removed ondemand tunable.
[CPUFREQ] Crusoe: longrun cpufreq module reports false min freq
[CPUFREQ] powernow-k8: improve error messages
|
|
arch/x86/boot/printf.c:59:10: warning: Using plain integer as NULL pointer
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Jeremy's gcc 3.4 seems to be unable to inline a 8 byte memcpy. But the
vdso doesn't support external references. Copy the structure members
of struct timezone explicitely instead.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
While I realize that the function isn't currently being used, I still
think an obvious mistake like this should be corrected.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
A check for unmapped apic was added before reading maxlvt but the early
read of maxlvt wasn't removed.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
|
|
When the TSC calibration fails then TSC is still used in
sched_clock(). Disable it completely in that case.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
|
|
tsc_enabled is set to 0 from the command line switch "notsc" and from
the mark_tsc_unstable code. Seperate those functionalities and replace
tsc_enable with tsc_disable. This makes also the native_sched_clock()
decision when to use TSC understandable.
Preparatory patch to solve the sched_clock() issue on 32 bit.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When the TSC is calibrated against the PIT due to the nonavailability
of PMTIMER/HPET or due to SMI interference then the setup of the per
CPU cyc2ns variables is skipped. This is unlikely to happen but it
would definitely render sched_clock() unusable.
This was introduced with commit 53d517cdbaac704352b3d0c10fecb99e0b54572e
x86: scale cyc_2_nsec according to CPU frequency
Update the per CPU cyc2ns variables in all exit pathes of tsc_calibrate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
|