aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kvm
AgeCommit message (Collapse)Author
2008-04-27KVM: SVM: align shadow CR4.MCE with hostJoerg Roedel
This patch aligns the host version of the CR4.MCE bit with the CR4 active in the guest. This is necessary to get MCE exceptions when the guest is running. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: indent svm_set_cr4 with tabs instead of spacesJoerg Roedel
The svm_set_cr4 function is indented with spaces. This patch replaces them with tabs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: Don't assume struct page for x86Anthony Liguori
This patch introduces a gfn_to_pfn() function and corresponding functions like kvm_release_pfn_dirty(). Using these new functions, we can modify the x86 MMU to no longer assume that it can always get a struct page for any given gfn. We don't want to eliminate gfn_to_page() entirely because a number of places assume they can do gfn_to_page() and then kmap() the results. When we support IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will succeed. This does not implement support for avoiding reference counting for reserved RAM or for IO memory. However, it should make those things pretty straight forward. Since we're only introducing new common symbols, I don't think it will break the non-x86 architectures but I haven't tested those. I've tested Intel, AMD, NPT, and hugetlbfs with Windows and Linux guests. [avi: fix overflow when shifting left pfns by adding casts] Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: prepopulate guest pages after write-protectingMarcelo Tosatti
Zdenek reported a bug where a looping "dmsetup status" eventually hangs on SMP guests. The problem is that kvm_mmu_get_page() prepopulates the shadow MMU before write protecting the guest page tables. By doing so, it leaves a window open where the guest can mark a pte as present while the host has shadow cached such pte as "notrap". Accesses to such address will fault in the guest without the host having a chance to fix the situation. Fix by moving the write protection before the pte prefetch. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: Only mark_page_accessed() if the page was accessed by the guestAvi Kivity
If the accessed bit is not set, the guest has never accessed this page (at least through this spte), so there's no need to mark the page accessed. This provides more accurate data for the eviction algortithm. Noted by Andrea Arcangeli. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Free apic access page on vm destructionAvi Kivity
Noticed by Marcelo Tosatti. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: allow the vm to shrink the kvm mmu shadow cachesIzik Eidus
Allow the Linux memory manager to reclaim memory in the kvm shadow cache. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: unify slots_lock usageMarcelo Tosatti
Unify slots_lock acquision around vcpu_run(). This is simpler and less error-prone. Also fix some callsites that were not grabbing the lock properly. [avi: drop slots_lock while in guest mode to avoid holding the lock for indefinite periods] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: VMX: Enable MSR Bitmap featureSheng Yang
MSR Bitmap controls whether the accessing of an MSR causes VM Exit. Eliminating exits on automatically saved and restored MSRs yields a small performance gain. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86: hardware task switching supportIzik Eidus
This emulates the x86 hardware task switch mechanism in software, as it is unsupported by either vmx or svm. It allows operating systems which use it, like freedos, to run as kvm guests. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86: add functions to get the cpl of vcpuIzik Eidus
Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: VMX: Add module option to disable flexpriorityAvi Kivity
Useful for debugging. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: no longer EXPERIMENTALAvi Kivity
Long overdue. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: Introduce and use spte_to_page()Avi Kivity
Encapsulate the pte mask'n'shift in a function. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: fix dirty bit setting when removing write permissionsIzik Eidus
When mmu_set_spte() checks if a page related to spte should be release as dirty or clean, it check if the shadow pte was writeble, but in case rmap_write_protect() is called called it is possible for shadow ptes that were writeble to become readonly and therefor mmu_set_spte will release the pages as clean. This patch fix this issue by marking the page as dirty inside rmap_write_protect(). Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: Set the accessed bit on non-speculative shadow ptesAvi Kivity
If we populate a shadow pte due to a fault (and not speculatively due to a pte write) then we can set the accessed bit on it, as we know it will be set immediately on the next guest instruction. This saves a read-modify-write operation. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: hypercall based pte updates and TLB flushesMarcelo Tosatti
Hypercall based pte updates are faster than faults, and also allow use of the lazy MMU mode to batch operations. Don't report the feature if two dimensional paging is enabled. [avi: - one mmu_op hypercall instead of one per op - allow 64-bit gpa on hypercall - don't pass host errors (-ENOMEM) to guest] [akpm: warning fix on i386] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Provide unlocked version of emulator_write_phys()Avi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: add basic paravirt supportMarcelo Tosatti
Add basic KVM paravirt support. Avoid vm-exits on IO delays. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add reset support for in kernel PITSheng Yang
Separate the reset part and prepare for reset support. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add save/restore supporting of in kernel PITSheng Yang
Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: In kernel PIT modelSheng Yang
The patch moves the PIT model from userspace to kernel, and increases the timer accuracy greatly. [marcelo: make last_injected_time per-guest] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-and-Acked-by: Alex Davis <alex14641@yahoo.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Remove pointless desc_ptr #ifdefAvi Kivity
The desc_struct changes left an unnecessary #ifdef; remove it. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: VMX: Don't adjust tsc offset forwardAvi Kivity
Most Intel hosts have a stable tsc, and playing with the offset only reduces accuracy. By limiting tsc offset adjustment only to forward updates, we effectively disable tsc offset adjustment on these hosts. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: replace remaining __FUNCTION__ occurancesHarvey Harrison
__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: detect if VCPU triple faultsJoerg Roedel
In the current inject_page_fault path KVM only checks if there is another PF pending and injects a DF then. But it has to check for a pending DF too to detect a shutdown condition in the VCPU. If this is not detected the VCPU goes to a PF -> DF -> PF loop when it should triple fault. This patch detects this condition and handles it with an KVM_SHUTDOWN exit to userspace. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Prefix control register accessors with kvm_ to avoid namespace pollutionAvi Kivity
Names like 'set_cr3()' look dangerously close to affecting the host. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: large page supportMarcelo Tosatti
Create large pages mappings if the guest PTE's are marked as such and the underlying memory is hugetlbfs backed. If the largepage contains write-protected pages, a large pte is not used. Gives a consistent 2% improvement for data copies on ram mounted filesystem, without NPT/EPT. Anthony measures a 4% improvement on 4-way kernbench, with NPT. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: ignore zapped root pagetablesMarcelo Tosatti
Mark zapped root pagetables as invalid and ignore such pages during lookup. This is a problem with the cr3-target feature, where a zapped root table fools the faulting code into creating a read-only mapping. The result is a lockup if the instruction can't be emulated. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Implement dummy values for MSR_PERF_STATUSAlexander Graf
Darwin relies on this and ceases to work without. Signed-off-by: Alexander Graf <alex@csgraf.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: sparse fixes for kvm/x86.cHarvey Harrison
In two case statements, use the ever popular 'i' instead of index: arch/x86/kvm/x86.c:1063:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here arch/x86/kvm/x86.c:1079:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here Make it static. arch/x86/kvm/x86.c:1945:24: warning: symbol 'emulate_ops' was not declared. Should it be static? Drop the return statements. arch/x86/kvm/x86.c:2878:2: warning: returning void-valued expression arch/x86/kvm/x86.c:2944:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: make iopm_base staticHarvey Harrison
Fixes sparse warning as well. arch/x86/kvm/svm.c:69:15: warning: symbol 'iopm_base' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86 emulator: fix sparse warnings in x86_emulate.cHarvey Harrison
Nesting __emulate_2op_nobyte inside__emulate_2op produces many shadowed variable warnings on the internal variable _tmp used by both macros. Change the outer macro to use __tmp. Avoids a sparse warning like the following at every call site of __emulate_2op arch/x86/kvm/x86_emulate.c:1091:3: warning: symbol '_tmp' shadows an earlier one arch/x86/kvm/x86_emulate.c:1091:3: originally declared here [18 more warnings suppressed] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add stat counter for hypercallsAmit Shah
Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Use x86's segment descriptor struct instead of private definitionAvi Kivity
The x86 desc_struct unification allows us to remove segment_descriptor.h. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add API for determining the number of supported memory slotsAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add API to retrieve the number of supported vcpus per vmAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86 emulator: make register_address_increment and JMP_REL static inlinesHarvey Harrison
Change jmp_rel() to a function as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86 emulator: make register_address, address_mask static inlinesHarvey Harrison
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86 emulator: add ad_mask static inlineHarvey Harrison
Replaces open-coded mask calculation in macros. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: paravirtualized clocksource: host partGlauber de Oliveira Costa
This is the host part of kvm clocksource implementation. As it does not include clockevents, it is a fairly simple implementation. We only have to register a per-vcpu area, and start writing to it periodically. The area is binary compatible with xen, as we use the same shadow_info structure. [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME] [avi: save full value of the msr, even if enable bit is clear] [avi: clear previous value of time_page] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: enable LBR virtualizationJoerg Roedel
This patch implements the Last Branch Record Virtualization (LBRV) feature of the AMD Barcelona and Phenom processors into the kvm-amd module. It will only be enabled if the guest enables last branch recording in the DEBUG_CTL MSR. So there is no increased world switch overhead when the guest doesn't use these MSRs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: allocate the MSR permission map per VCPUJoerg Roedel
This patch changes the kvm-amd module to allocate the SVM MSR permission map per VCPU instead of a global map for all VCPUs. With this we have more flexibility allowing specific guests to access virtualized MSRs. This is required for LBR virtualization. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: let init_vmcb() take struct vcpu_svm as parameterJoerg Roedel
Change the parameter of the init_vmcb() function in the kvm-amd module from struct vmcb to struct vcpu_svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: VMX: fix typo in VMX header defineRyan Harper
Looking at Intel Volume 3b, page 148, table 20-11 and noticed that the field name is 'Deliver' not 'Deliever'. Attached patch changes the define name and its user in vmx.c Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: add support for Nested PagingJoerg Roedel
This patch contains the SVM architecture dependent changes for KVM to enable support for the Nested Paging feature of AMD Barcelona and Phenom processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: add TDP support to the KVM MMUJoerg Roedel
This patch contains the changes to the KVM MMU necessary for support of the Nested Paging feature in AMD Barcelona and Phenom Processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: export the load_pdptrs() function to modulesJoerg Roedel
The load_pdptrs() function is required in the SVM module for NPT support. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: make the __nonpaging_map function genericJoerg Roedel
The mapping function for the nonpaging case in the softmmu does basically the same as required for Nested Paging. Make this function generic so it can be used for both. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: export information about NPT to generic x86 codeJoerg Roedel
The generic x86 code has to know if the specific implementation uses Nested Paging. In the generic code Nested Paging is called Two Dimensional Paging (TDP) to avoid confusion with (future) TDP implementations of other vendors. This patch exports the availability of TDP to the generic x86 code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>