aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2008-01-30KVM: Add ioctl to tss address from userspace,Izik Eidus
Currently kvm has a wart in that it requires three extra pages for use as a tss when emulating real mode on Intel. This patch moves the allocation internally, only requiring userspace to tell us where in the physical address space we can place the tss. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30KVM: Per-architecture hypercall definitionsChristian Borntraeger
Currently kvm provides hypercalls only for x86* architectures. To provide hypercall infrastructure for other kvm architectures I split kvm_para.h into a generic header file and architecture specific definitions. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30KVM: Support assigning userspace memory to the guestIzik Eidus
Instead of having the kernel allocate memory to the guest, let userspace allocate it and pass the address to the kernel. This is required for s390 support, but also enables features like memory sharing and using hugetlbfs backed memory. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30KVM: Allow dynamic allocation of the mmu shadow cache sizeIzik Eidus
The user is now able to set how many mmu pages will be allocated to the guest. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30KVM: Refactor hypercall infrastructure (v3)Anthony Liguori
This patch refactors the current hypercall infrastructure to better support live migration and SMP. It eliminates the hypercall page by trapping the UD exception that would occur if you used the wrong hypercall instruction for the underlying architecture and replacing it with the right one lazily. A fall-out of this patch is that the unhandled hypercalls no longer trap to userspace. There is very little reason though to use a hypercall to communicate with userspace as PIO or MMIO can be used. There is no code in tree that uses userspace hypercalls. [avi: fix #ud injection on vmx] Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-30x86: use the same pgd_list for PAE and 64-bitJeremy Fitzhardinge
Use a standard list threaded through page->lru for maintaining the pgd list on PAE. This is the same as 64-bit, and seems saner than using a non-standard list via page->index. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: defer cr3 reload when doing pud_clear()Jeremy Fitzhardinge
PAE mode requires that we reload cr3 in order to guarantee that changes to the pgd will be noticed by the processor. This means that in principle pud_clear needs to reload cr3 every time. However, because reloading cr3 implies a tlb flush, we want to avoid it where possible. pud_clear() is only used in a couple of places: - in free_pmd_range(), when pulling down a range of process address space, and - huge_pmd_unshare() In both cases, the calling code will do a a tlb flush anyway, so there's no need to do it within pud_clear(). In free_pmd_range(), the pud_clear is immediately followed by pmd_free_tlb(); we can hook that to make the mmu_gather do an unconditional full flush to make sure cr3 gets reloaded. In huge_pmd_unshare, it is followed by flush_tlb_range, which always results in a full cr3-reload tlb flush. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: early boot debugging via FireWire (ohci1394_dma=early)Bernhard Kaindl
This patch adds a new configuration option, which adds support for a new early_param which gets checked in arch/x86/kernel/setup_{32,64}.c:setup_arch() to decide wether OHCI-1394 FireWire controllers should be initialized and enabled for physical DMA access to allow remote debugging of early problems like issues ACPI or other subsystems which are executed very early. If the config option is not enabled, no code is changed, and if the boot paramenter is not given, no new code is executed, and independent of that, all new code is freed after boot, so the config option can be even enabled in standard, non-debug kernels. With specialized tools, it is then possible to get debugging information from machines which have no serial ports (notebooks) such as the printk buffer contents, or any data which can be referenced from global pointers, if it is stored below the 4GB limit and even memory dumps of of the physical RAM region below the 4GB limit can be taken without any cooperation from the CPU of the host, so the machine can be crashed early, it does not matter. In the extreme, even kernel debuggers can be accessed in this way. I wrote a small kgdb module and an accompanying gdb stub for FireWire which allows to gdb to talk to kgdb using remote remory reads and writes over FireWire. An version of the gdb stub fore FireWire is able to read all global data from a system which is running a a normal kernel without any kernel debugger, without any interruption or support of the system's CPU. That way, e.g. the task struct and so on can be read and even manipulated when the physical DMA access is granted. A HOWTO is included in this patch, in Documentation/debugging-via-ohci1394.txt and I've put a copy online at ftp://ftp.suse.de/private/bk/firewire/docs/debugging-via-ohci1394.txt It also has links to all the tools which are available to make use of it another copy of it is online at: ftp://ftp.suse.de/private/bk/firewire/kernel/ohci1394_dma_early-v2.diff Signed-Off-By: Bernhard Kaindl <bk@suse.de> Tested-By: Thomas Renninger <trenn@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: don't special-case pmd allocations as muchJeremy Fitzhardinge
In x86 PAE mode, stop treating pmds as a special case. Previously they were always allocated and freed with the pgd. The modifies the code to be the same as 64-bit mode, where they are allocated on demand. This is a step on the way to unifying 32/64-bit pagetable allocation as much as possible. There is a complicating wart, however. When you install a new reference to a pmd in the pgd, the processor isn't guaranteed to see it unless you reload cr3. Since reloading cr3 also has the side-effect of flushing the tlb, this is an expense that we want to avoid whereever possible. This patch simply avoids reloading cr3 unless the update is to the current pagetable. Later patches will optimise this further. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: arch/x86/mm/init_32.c cleanupIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: make ioremap() UC by defaultIngo Molnar
Yes! A mere 120 c_p_a() fixing and rewriting patches later, we are now confident that we can enable UC by default for ioremap(), on x86 too. Every other architectures was doing this already. Doing so makes Linux more robust against MTRR mixups (which might go unnoticed if BIOS writers test other OSs only - where PAT might override bad MTRRs defaults). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: cpa: fix the self-testIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: fix clflush_page_range logicIngo Molnar
only present ptes must be flushed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: add testcases for RODATA and NX protections/attributesArjan van de Ven
Latest update; I now have 4 NX tests, but 2 fail so they're #if 0'd. I also cleaned up the NX test code quite a bit, and got rid of the ugly exception table sorting stuff. From: Arjan van de Ven <arjan@linux.intel.com> This patch adds testcases for the CONFIG_DEBUG_RODATA configuration option as well as the NX CPU feature/mappings. Both testcases can move to tests/ once that patch gets merged into mainline. (I'm half considering moving the rodata test into mm/init.c but I'll wait with that until init.c is unified) As part of this I had to fix a not-quite-right alignment in the vmlinux.lds.h for the RODATA sections, which lead to 1 page less being marked read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: remove flush_agp_mappings()Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: cpa: move flush to cpaThomas Gleixner
The set_memory_* and set_pages_* family of API's currently requires the callers to do a global tlb flush after the function call; forgetting this is a very nasty deathtrap. This patch moves the global tlb flush into each of the callers Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: make various pageattr.c functions staticArjan van de Ven
change_page_attr_add is only used in pageattr.c now, so we can make this function static. change_page_attr() isn't used anywere at all anymore; this function is a really bad API anyway so just remove the bloat entirely. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: cpa: set_memory_notpresent()Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: fix ioremap APIThomas Gleixner
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: fix the missing BIOS area check in page_is_ramThomas Gleixner
page_is_ram has a FIXME since ages, which reminds to sanity check the BIOS area between 640k and 1M, which is sometimes falsely reported as RAM in the e820 tables. Implement the sanity check. Move the BIOS range defines from pageattr.c into e820.h to avoid duplicate defines. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: move page_is_ram() functionThomas Gleixner
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: deprecate change_page_attr() for driversArjan van de Ven
With the introduction of the new API, no driver or non-archcore code needs to use c-p-a anymore, so this patch also deprecates the EXPORT_SYMBOL of CPA (it's a horrible API after all). Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: convert CPA users to the new set_page_ APIArjan van de Ven
This patch converts various users of change_page_attr() to the new, more intent driven set_page_*/set_memory_* API set. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: a new API for drivers/etc to control cache and other page attributesArjan van de Ven
Right now, if drivers or other code want to change, say, a cache attribute of a page, the only API they have is change_page_attr(). c-p-a is a really bad API for this, because it forces the caller to know *ALL* the attributes he wants for the page, not just the 1 thing he wants to change. So code that wants to set a page uncachable, needs to be aware of the NX status as well etc etc etc. This patch introduces a set of new APIs for this, set_pages_<attr> and set_memory_<attr>, that offer a logical change to the user, and leave all attributes not implied by the requested logical change alone. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: introduce max_pfn_mappedThomas Gleixner
64bit uses end_pfn_map and 32bit uses max_low_pfn. There are several files which have #ifdef'ed defines which map either to end_pfn_map or max_low_pfn. Replace this by a universal define and clean up all the other instances. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30x86: add PAGE_KERNEL_EXEC_NOCACHEIngo Molnar
add PAGE_KERNEL_EXEC_NOCACHE. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add PG_LEVEL enumThomas Gleixner
this way PG_LEVEL_1GB will be an easy change. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: clean up lookup_address() declarationsThomas Gleixner
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: clean up arch/x86/mm/pageattr.cIngo Molnar
do some leftover cleanups in the now unified arch/x86/mm/pageattr.c file. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: introduce native_set_pte_atomic() on 64-bit tooIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=yIngo Molnar
get more testing of the c_p_a() code done by not turning off PSE on DEBUG_PAGEALLOC. this simplifies the early pagetable setup code, and tests the largepage-splitup code quite heavily. In the end, all the largepages will be split up pretty quickly, so there's no difference to how DEBUG_PAGEALLOC worked before. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: fix some bugs about EFI runtime code mappingHuang, Ying
This patch fixes some bugs of making EFI runtime code executable. - Use change_page_attr in i386 too. Because the runtime code may be mapped not through ioremap. - If there is no _PAGE_NX in __supported_pte_mask, the change_page_attr is not called. - Make efi_ioremap map pages as PAGE_KERNEL_EXEC_NOCACHE, because EFI runtime code may be mapped through efi_ioremap. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: remove set_kernel_exec()Andi Kleen
The SMP trampoline always runs in real mode, so making it executable in the page tables doesn't make much sense because it executes before page tables are set up. That was the only user of set_kernel_exec(). Remove set_kernel_exec(). Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: introduce canon_pgprot()Andi Kleen
Introduce canon_pgprot() Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: don't drop NX bit in pte modifier functions on 32-bitAndi Kleen
The pte_* modifier functions that cleared bits dropped the NX bit on 32bit PAE because they only worked in int, but NX is in bit 63. Fix that by adding appropiate casts so that the arithmetic happens as long long on PAE kernels. I decided to just use 64bit arithmetic instead of open coding like pte_modify() because gcc should generate good enough code for that now. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add pte_pgprot to 32-bitAndi Kleen
64bit already had it. Needed for later patches. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: shrink __PAGE_KERNEL/__PAGE_KERNEL_EXEC on non PAE kernelsAndi Kleen
No need to make it 64bit there. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: fix early_ioremap()/btmapIngo Molnar
fix a long-standing weakness of the early-ioremap allocator: it uses a single pgd entry for the boot mappings, and was not properly protecting itself against crossing a 2MB (4MB) boundary. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: increase the number of boot-mappingsIngo Molnar
increase max early_ioremap() remapping size from 64K to 256K. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: enhance early_ioremap()Ingo Molnar
- allow nesting of up to 4 levels Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86 32-bit boot: rename bt_ioremap() to early_ioremap()Huang, Ying
This patch renames bt_ioremap to early_ioremap, which is used in x86_64. This makes it easier to merge i386 and x86_64 usage. [ mingo@elte.hu: fix ] Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: replace boot_ioremap() with enhanced bt_ioremap() - remove boot_ioremap()Huang, Ying
This patch replaces boot_ioremap invokation with bt_ioremap and removes the boot_ioremap implementation. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30i386 boot: replace boot_ioremap with enhanced bt_ioremap - enhance bt_ioremapHuang, Ying
This patch makes it possible for bt_ioremap() to be used before paging_init(), via providing an early implementation of set_fixmap() that can be used before paging_init(). This way boot_ioremap() can be replaced by bt_ioremap(). Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: return the page table level in lookup_address()Ingo Molnar
based on this patch from Andi Kleen: | Subject: CPA: Return the page table level in lookup_address() | From: Andi Kleen <ak@suse.de> | | Needed for the next change. | | And change all the callers. and ported it to x86.git. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add pte accessors for the global bitAndi Kleen
Needed for some test code. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: clean up pte_execAndi Kleen
- Rename it to pte_exec() from pte_exec_kernel(). There is nothing kernel specific in there. - Move it into the common file because _PAGE_NX is 0 on !PAE and then pte_exec() will be always evaluate to true. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: change ioremap() to default to uncachedIngo Molnar
Prepare ioremap() to default to uncached. This will be the safest - but first we have to fix CPA. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: demacro asm-x86/pgalloc_32.hJeremy Fitzhardinge
Convert macros into inline functions, for better type-checking. This patch required a little bit of fiddling with headers in order to make __(pte|pmd)_free_tlb inline rather than macros. asm-generic/tlb.h includes asm/pgalloc.h, though it doesn't directly use any pgalloc definitions. I removed this include to avoid an include cycle, but it may cause secondary compile failures by things depending on the indirect inclusion; arch/x86/mm/hugetlbpage.c was one such place; there may be others. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add mm parameter to paravirt_alloc_pdJeremy Fitzhardinge
Add mm to paravirt_alloc_pd, partly to make it consistent with paravirt_alloc_pt, and because later changes will make use of it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30x86: add support for the RDC R-321x SoCFlorian Fainelli
This patch adds support for the RDC R-321x system-on-chip, also known as R-861x-(G). It uses the generic GPIO API and has support for the on-chip hardware watchdog. Build-fix from: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>