path: root/drivers/kvm/vmx.c
Age | Commit message | Author
2007-10-13 | KVM: Migrate lapic hrtimer when vcpu moves to another cpu | Eddie Dong
This reduces the overhead caused by accessing cachelines from the wrong node, and simplifies locking. [Qing: fix for inactive or expired one-shot timer] Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Keep track of missed timer irq injections | Eddie Dong
The APIC timer IRQ is set every time a certain period expires in host time, but the guest may be descheduled at that time, so the irq may be overwritten by a later firing. This patch keeps track of the number of fired irqs and decrements it only when the IRQ is injected into the guest or buffered in the APIC. Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
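A minimal userspace sketch of the counting idea in this commit message, not the kernel code: every timer expiry increments a pending counter, and the counter only drops when an interrupt is actually delivered. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-vcpu timer state; not the KVM struct. */
struct timer_state {
    unsigned int pending;   /* expiries not yet delivered to the guest */
};

/* Called on every host-side timer expiry, even if the guest is descheduled. */
static void timer_fired(struct timer_state *t)
{
    t->pending++;
}

/* Called when the guest can accept an interrupt.
 * Returns true if one timer irq was delivered. */
static bool timer_try_inject(struct timer_state *t)
{
    if (t->pending == 0)
        return false;
    t->pending--;
    return true;
}
```

With a plain "irq pending" flag, two expiries while the guest is descheduled would collapse into one; with the counter, both are eventually delivered.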
2007-10-13 | KVM: VMX: Use shadow TPR/cr8 for 64-bits guests | Yang, Sheng
This patch enables the VMX TPR shadow for CR8 accesses. 64-bit Windows accesses the TPR frequently through CR8. The TPR shadow improves the performance of TPR accesses by not causing a vmexit. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: pending irq save/restore | Eddie Dong
Add in kernel irqchip save/restore support for pending vectors. [avi: fix compile warning on i386] [avi: remove printk] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Emulate hlt in the kernel | Eddie Dong
By sleeping in the kernel when hlt is executed, we simplify the in-kernel guest interrupt path considerably. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Emulate local APIC in kernel | Eddie Dong
Because lightweight exits (exits which don't involve userspace) are many times faster than heavyweight exits, it makes sense to emulate high usage devices in the kernel. The local APIC is one such device, especially for Windows and for SMP, so we add an APIC model to kvm. It also allows in-kernel host-side drivers to inject interrupts without going through userspace. [compile fix on i386 from Jindrich Makovicka] Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Define and use cr8 access functions | Eddie Dong
This patch wraps the APIC base register and CR8 operations to provide a single API for both the user-level irqchip and the kernel irqchip. It is preparation for merging the lapic/ioapic patch. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Add support for in-kernel PIC emulation | Eddie Dong
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: VMX: Split segments reload in vmx_load_host_state() | Laurent Vivier
vmx_load_host_state() bundles fs, gs, ldt, and tss reloading into one in the hope that it is infrequent. With smp guests, fs reloading is frequent due to fs being used by threads. Unbundle the reloads to reduce expensive gs reloads. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: VMX: allow rmode_tss_base() to work with >2G of guest memory | Izik Eidus
Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Communicate cr8 changes to userspace | Yang, Sheng
This allows running 64-bit Windows. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Close minor race in signal handling | Avi Kivity
We need to check for signals inside the critical section, otherwise a signal can be sent which we will not notice. Also move the check before entry, so that if the signal happens before the first entry, we exit immediately instead of waiting for something to happen to the guest. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Clean up kvm_setup_pio() | Laurent Vivier
Split kvm_setup_pio() into two functions, one to setup in/out pio (kvm_emulate_pio()) and one to setup ins/outs pio (kvm_emulate_pio_string()). Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Cleanup string I/O instruction emulation | Laurent Vivier
Both vmx and svm decode the I/O instructions, and both botch the job, requiring the instruction prefixes to be fetched in order to completely decode the instruction. So, if we see a string I/O instruction, use the x86 emulator to decode it, as it already has all the prefix decoding machinery. This patch defines ins/outs opcodes in x86_emulate.c and calls emulate_instruction() from io_interception() (svm.c) and from handle_io() (vmx.c). It removes all vmx/svm prefix instruction decoders (get_addr_size(), io_get_override(), io_address(), get_io_count()) Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: VMX: Remove a duplicated ia32e mode vm entry control | Li, Xin B
Remove a duplicated ia32e mode VM Entry control definition and use the proper one. Signed-off-by: Xin Li <xin.b.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Use kmem_cache_free for kmem_cache_zalloc'ed objects | Rusty Russell
We use kfree in svm.c and vmx.c, and this works, but it could break at any time. kfree() is supposed to match up with kmalloc(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Add and use pr_unimpl for standard formatting of unimplemented features | Rusty Russell
All guest-invokable printks should be ratelimited to prevent malicious guests from flooding logs. This is a start. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Fix defined but not used warning in drivers/kvm/vmx.c | Gabriel C
move_msr_up() is used only on X86_64 and generates a warning on !X86_64 Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Remove redundant alloc_vmcs_cpu declaration | Rusty Russell
alloc_vmcs_cpu is already declared (static) above, no need to redeclare. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: VMX: Add cpu consistency check | Yang, Sheng
All the physical CPUs on the board should support the same VMX feature set. Add check_processor_compatibility to kvm_arch_ops for the consistency check. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Use kmem cache for allocating vcpus | Rusty Russell
Avi wants the allocations of vcpus centralized again. The easiest way is to add a "size" arg to kvm_init_arch, and expose the thus-prepared cache to the modules. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Remove kvm_{read,write}_guest() | Laurent Vivier
... in favor of the more general emulator_{read,write}_*. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: VMX: pass vcpu_vmx internally | Rusty Russell
container_of is wonderful, but not casting at all is better. This patch changes vmx.c's internal functions to pass "struct vcpu_vmx" instead of "struct kvm_vcpu" and using container_of. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
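The pattern this commit message describes can be sketched in userspace C as follows. This is an illustration under simplifying assumptions, not the vmx.c code: the structures are stripped-down stand-ins, the macro is a simplified version of the kernel's container_of(), and the helper names are chosen for the example.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, stripped-down stand-ins for the real structures. */
struct kvm_vcpu {
    int vcpu_id;
};

struct vcpu_vmx {
    struct kvm_vcpu vcpu;   /* generic part embedded in the VMX-specific part */
    int launched;
};

/* Convert once, at the boundary where generic code hands us a kvm_vcpu... */
static struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
{
    return container_of(vcpu, struct vcpu_vmx, vcpu);
}

/* ...and let internal helpers take the VMX type directly, with no cast. */
static void vmx_mark_launched(struct vcpu_vmx *vmx)
{
    vmx->launched = 1;
}
```

Passing the derived type internally means the pointer arithmetic happens only at entry points, and the compiler checks the type on every internal call.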
2007-10-13 | KVM: Convert vm lock to a mutex | Shaohua Li
This allows the kvm mmu to perform sleepy operations, such as memory allocation. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Use the scheduler preemption notifiers to make kvm preemptible | Avi Kivity
Current kvm disables preemption while the new virtualization registers are in use. This of course is not very good for latency sensitive workloads (one use of virtualization is to offload user interface and other latency insensitive stuff to a container, so that it is easier to analyze the remaining workload). This patch re-enables preemption for kvm; preemption is now only disabled when switching the registers in and out, and during the switch to guest mode and back. Contains fixes from Shaohua Li <shaohua.li@intel.com>. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: VMX: Improve the method of writing vmcs control | Yang, Sheng
Put the cpu feature detection part in hardware_setup, and store the vmcs condition in a global variable for further checks. [glommer: fix for some i386-only machines not supporting CR8 load/store exiting] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Dynamically allocate vcpus | Rusty Russell
This patch converts the vcpus array in "struct kvm" to a pointer array, and changes the "vcpu_create" and "vcpu_setup" hooks into one "vcpu_create" call which does the allocation and initialization of the vcpu (calling back into the kvm_vcpu_init core helper). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Remove arch specific components from the general code | Gregory Haskins
struct kvm_vcpu has vmx-specific members; remove them to a private structure. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: VMX: Import some constants of vmcs from IA32 SDM | Yang, Sheng
This patch mainly imports some constants and renames two existing vmcs constants according to the IA32 SDM. It also adds two constants to indicate the Lock bit and Enable bit in MSR_IA32_FEATURE_CONTROL, and replaces the hardcoded _5_ with these two bits. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
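To make the "5" concrete: in the Intel SDM, bit 0 of IA32_FEATURE_CONTROL is the lock bit and bit 2 enables VMXON outside SMX, so 5 is the OR of the two. The sketch below shows a named-constant check; the macro and function names are illustrative and may not match the ones the patch introduced.

```c
#include <stdint.h>

/* Bit positions per the Intel SDM; the macro names here are illustrative. */
#define FEATURE_CONTROL_LOCKED         (1ULL << 0)
#define FEATURE_CONTROL_VMXON_ENABLED  (1ULL << 2)

/* Returns nonzero if firmware locked the MSR while leaving VMX disabled. */
static int vmx_disabled_by_bios(uint64_t feature_control)
{
    uint64_t mask = FEATURE_CONTROL_LOCKED | FEATURE_CONTROL_VMXON_ENABLED;

    /* Locked but not enabled; same as testing (msr & 5) == 1. */
    return (feature_control & mask) == FEATURE_CONTROL_LOCKED;
}
```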
2007-10-13 | KVM: Hoist kvm_mmu_reload() out of the critical section | Shaohua Li
vmx_cpu_run doesn't handle error correctly and kvm_mmu_reload might sleep with mutex changes, so I move it above. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Set exit_reason to KVM_EXIT_MMIO where run->mmio is initialized. | Jeff Dike
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Use standard CR4 flags, tighten checking | Rusty Russell
On this machine (Intel), writing to the CR4 bits 0x00000800 and 0x00001000 causes a GPF. The Intel manual is a little unclear, but AFAICT they're reserved, too. Also fix spelling of CR4_RESEVED_BITS. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.h | Rusty Russell
The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR0_RESEVED_BITS (no code change) and fix typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 | KVM: SMP: Add vcpu_id field in struct vcpu | Qing He
This patch adds a `vcpu_id' field in `struct vcpu', so we can differentiate BSP and APs without pointer comparison or arithmetic. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: Clean up #includes | Avi Kivity
Remove unnecessary ones, and rearrange the remaining in the standard order. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Remove unnecessary code in vmx_tlb_flush() | Avi Kivity
A vmexit implicitly flushes the tlb; the code is bogus. Noted by Shaohua Li. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Reinitialize the real-mode tss when entering real mode | Avi Kivity
Protected mode code may have corrupted the real-mode tss, so re-initialize it when switching to real mode. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Fix interrupt checking on lightweight exit | Gregory Haskins
With kernel-injected interrupts, we need to check for interrupts on lightweight exits too. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Ensure vcpu time stamp counter is monotonous | Avi Kivity
If the time stamp counter goes backwards, a guest delay loop can become infinite. This can happen if a vcpu is migrated to another cpu, where the counter has a lower value than the first cpu. Since we're doing an IPI to the first cpu anyway, we can use that to pick up the old tsc, and use that to calculate the adjustment we need to make to the tsc offset. Signed-off-by: Avi Kivity <avi@qumranet.com>
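The adjustment described in this commit message comes down to a small piece of arithmetic: the guest sees host TSC plus an offset, so if the new cpu's counter is behind the old one, the difference is added to the offset. The sketch below is a userspace illustration with hypothetical function names, not the vmx.c code.

```c
#include <stdint.h>

/* The guest-visible TSC is the host TSC plus a per-vcpu offset. */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t offset)
{
    return host_tsc + offset;
}

/* On migration, compensate if the new cpu's TSC is behind the value last
 * read on the old cpu, so the guest never sees time go backwards. */
static uint64_t adjust_tsc_offset(uint64_t offset,
                                  uint64_t old_host_tsc,
                                  uint64_t new_host_tsc)
{
    if (new_host_tsc < old_host_tsc)
        offset += old_host_tsc - new_host_tsc;
    return offset;
}
```

For example, with an offset of 100, an old-cpu TSC of 1000 and a new-cpu TSC of 400, the offset becomes 700 and the guest still reads 1100 right after the move.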
2007-07-16 | KVM: Initialize the BSP bit in the APIC_BASE msr correctly | Avi Kivity
Needs to be set on vcpu 0 only. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Replace memset(<addr>, 0, PAGESIZE) with clear_page(<addr>) | Shani Moideen
Signed-off-by: Shani Moideen <shani.moideen@wipro.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: Flush remote tlbs when reducing shadow pte permissions | Avi Kivity
When a vcpu causes a shadow tlb entry to have reduced permissions, it must also clear the tlb on remote vcpus. We do that by:
 - setting a bit on the vcpu that requests a tlb flush before the next entry
 - if the vcpu is currently executing, we send an ipi to make sure it exits before we continue
Signed-off-by: Avi Kivity <avi@qumranet.com>
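The two steps in this commit message (set a request bit, and kick the vcpu if it is running) can be modelled in userspace with C11 atomics. This is only a sketch: the struct, the request flag name, and the counters standing in for an IPI and a TLB flush are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define REQ_TLB_FLUSH (1u << 0)   /* hypothetical request flag */

/* Hypothetical model of the per-vcpu state involved. */
struct vcpu {
    atomic_uint requests;        /* pending request bits */
    atomic_bool in_guest_mode;   /* true while the vcpu is executing guest code */
    unsigned int flushes;        /* counts simulated TLB flushes */
    unsigned int kicks;          /* counts simulated IPIs */
};

/* Remote side: ask the target vcpu to flush before its next entry. */
static void request_tlb_flush(struct vcpu *v)
{
    atomic_fetch_or(&v->requests, REQ_TLB_FLUSH);
    if (atomic_load(&v->in_guest_mode))
        v->kicks++;   /* stand-in for an IPI that forces a vmexit */
}

/* Target side: run before every guest entry. */
static void before_guest_entry(struct vcpu *v)
{
    unsigned int old = atomic_fetch_and(&v->requests, ~REQ_TLB_FLUSH);

    if (old & REQ_TLB_FLUSH)
        v->flushes++;   /* stand-in for the actual TLB flush */
}
```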
2007-07-16 | KVM: Emulate hlt on real mode for Intel | Avi Kivity
This has two use cases: the bios can't boot from disk, and guest smp bootstrap. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: Move duplicate halt handling code into kvm_main.c | Avi Kivity
Will soon have a third user. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: Replace C code with call to ARRAY_SIZE() macro. | Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: Lazy guest cr3 switching | Avi Kivity
Switching the guest paging context may require us to allocate memory, which might fail. Instead of wiring up error paths everywhere, make context switching lazy and actually do the switch before the next guest entry, where we can return an error if allocation fails. Signed-off-by: Avi Kivity <avi@qumranet.com>
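A rough userspace sketch of the lazy approach described in this commit message: the cr3 write only records the new value, and the allocation that can fail is deferred to a single reload step before guest entry. All names and the 4096-byte allocation are assumptions made for the example, not the KVM implementation.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the state involved; not the KVM structures. */
struct vcpu {
    unsigned long guest_cr3;
    bool mmu_needs_reload;
    void *shadow_root;   /* allocated lazily */
};

/* Guest writes cr3: just record it; nothing can fail here. */
static void set_cr3(struct vcpu *v, unsigned long cr3)
{
    v->guest_cr3 = cr3;
    v->mmu_needs_reload = true;
}

/* Called before each guest entry; the one place that reports failure. */
static int mmu_reload(struct vcpu *v)
{
    if (!v->mmu_needs_reload)
        return 0;
    free(v->shadow_root);
    v->shadow_root = malloc(4096);
    if (!v->shadow_root)
        return -1;   /* the kernel would return -ENOMEM here */
    v->mmu_needs_reload = false;
    return 0;
}
```

The callers of set_cr3() need no error handling; only the entry path checks the return value of mmu_reload().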
2007-07-16 | KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit | Eddie Dong
MSR_EFER.LME/LMA bits are automatically saved/restored by VMX hardware; KVM only needs to save the NX/SCE bits at the time of a heavyweight VM exit. But clearing the NX bit in the host environment may cause a system hang if the host page table is using EXB bits, thus we leave the NX bit as it is. If host NX=1 and guest NX=0, we can do a guest page table EXB bits check before inserting a shadow pte (though no guest is expecting to see this kind of gp fault). If host NX=0, we present no Execute-Disable feature to the guest, thus there is no host NX=0, guest NX=1 combination. This patch reduces raw vmexit time by ~27%. Me: fix compile warnings on i386. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Cleanup redundant code in MSR set | Eddie Dong
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Avoid saving and restoring msrs on lightweight vmexit | Eddie Dong
In a lightweight exit (where we exit and reenter the guest without scheduling or exiting to userspace in between), we don't need various msrs on the host, and avoiding shuffling them around reduces raw exit time by 8%. i386 compile fix by Daniel Hecken <dh@bahntechnik.de>. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 | KVM: VMX: Handle #SS faults from real mode | Nitin A Kamble
Instructions with the address size override prefix (opcode 0x67) cause a #SS fault with error code 0 in VM86 mode. Forward them to the emulator. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>