aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kvm/x86.c
AgeCommit message (Collapse)Author
2009-09-10Revert "KVM: x86: check for cr3 validity in ioctl_set_sregs"Jan Kiszka
This reverts commit 6c20e1442bb1c62914bb85b7f4a38973d2a423ba. To my understanding, it became obsolete with the advent of the more robust check in mmu_alloc_roots (89da4ff17f). Moreover, it prevents the conceptually safe pattern 1. set sregs 2. register mem-slots 3. run vcpu by setting a sticky triple fault during step 1. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: handle AMD microcode MSRAndre Przywara
Windows 7 tries to update the CPU's microcode on some processors, so we ignore the MSR write here. The patchlevel register is already handled (returning 0), because the MSR number is the same as Intel's. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: x2apic interface to lapicGleb Natapov
This patch implements MSR interface to local apic as defines by x2apic Intel specification. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Add Directed EOI support to APIC emulationGleb Natapov
Directed EOI is specified by x2APIC, but is available even when lapic is in xAPIC mode. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Trace mmioAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Ignore PCI ECS I/O enablementAndre Przywara
Linux guests will try to enable access to the extended PCI config space via the I/O ports 0xCF8/0xCFC on AMD Fam10h CPU. Since we (currently?) don't use ECS, simply ignore write and read attempts. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: remove in_range from io devicesMichael S. Tsirkin
This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write functions. in_range now becomes unused so it is removed from device ops in favor of read/write callbacks performing range checks internally. This allows aliasing (mostly for in-kernel virtio), as well as better error handling by making it possible to pass errors up to userspace. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: switch pit creation to slots_lockMichael S. Tsirkin
switch pit creation to slots_lock. slots_lock is already taken for read everywhere, so we only need to take it for write when creating pit. This is in preparation to removing in_range and kvm->lock around it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: introduce module parameter for ignoring unknown MSRs accessesAndre Przywara
KVM will inject a #GP into the guest if that tries to access unhandled MSRs. This will crash many guests. Although it would be the correct way to actually handle these MSRs, we introduce a runtime switchable module param called "ignore_msrs" (defaults to 0). If this is Y, unknown MSR reads will return 0, while MSR writes are simply dropped. In both cases we print a message to dmesg to inform the user about that. You can change the behaviour at any time by saying: # echo 1 > /sys/modules/kvm/parameters/ignore_msrs Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: ignore reads from AMDs C1E enabled MSRAndre Przywara
If the Linux kernel detects an C1E capable AMD processor (K8 RevF and higher), it will access a certain MSR on every attempt to go to halt. Explicitly handle this read and return 0 to let KVM run a Linux guest with the native AMD host CPU propagated to the guest. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: ignore AMDs HWCR register access to set the FFDIS bitAndre Przywara
Linux tries to disable the flush filter on all AMD K8 CPUs. Since KVM does not handle the needed MSR, the injected #GP will panic the Linux kernel. Ignore setting of the HWCR.FFDIS bit in this MSR to let Linux boot with an AMD K8 family guest CPU. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: x86: missing locking in PIT/IRQCHIP/SET_BSP_CPU ioctl pathsMarcelo Tosatti
Correct missing locking in a few places in x86's vm_ioctl handling path. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Allow emulation of syscalls instructions on #UDAndre Przywara
Add the opcodes for syscall, sysenter and sysexit to the list of instructions handled by the undefined opcode handler. Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: convert custom marker based tracing to event tracesMarcelo Tosatti
This allows use of the powerful ftrace infrastructure. See Documentation/trace/ for usage information. [avi, stephen: various build fixes] [sheng: fix control register breakage] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Replace pending exception by PF if it happens seriallyGleb Natapov
Replace previous exception with a new one in a hope that instruction re-execution will regenerate lost exception. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Move performance counter MSR access interception to generic x86 pathAndre Przywara
The performance counter MSRs are different for AMD and Intel CPUs and they are chosen mainly by the CPUID vendor string. This patch catches writes to all addresses (regardless of VMX/SVM path) and handles them in the generic MSR handler routine. Writing a 0 into the event select register is something we perfectly emulate ;-), so don't print out a warning to dmesg in this case. This fixes booting a 64bit Windows guest with an AMD CPUID on an Intel host. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Use macro to iterate over vcpus.Gleb Natapov
[christian: remove unused variables on s390] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Introduce kvm_vcpu_is_bsp() function.Gleb Natapov
Use it instead of open code "vcpu_id zero is BSP" assumption. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: MMU: Adjust pte accessors to explicitly indicate guest or shadow pteAvi Kivity
Since the guest and host ptes can have wildly different format, adjust the pte accessor names to indicate on which type of pte they operate on. No functional changes. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: switch irq injection/acking data structures to irq_lockMarcelo Tosatti
Protect irq injection/acking data structures with a separate irq_lock mutex. This fixes the following deadlock: CPU A CPU B kvm_vm_ioctl_deassign_dev_irq() mutex_lock(&kvm->lock); worker_thread() -> kvm_deassign_irq() -> kvm_assigned_dev_interrupt_work_handler() -> deassign_host_irq() mutex_lock(&kvm->lock); -> cancel_work_sync() [blocked] [gleb: fix ia64 path] Reported-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Cache pdptrsAvi Kivity
Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx, guest memory access on svm) extract them on demand. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: cleanup io_device codeGregory Haskins
We modernize the io_device code so that we use container_of() instead of dev->private, and move the vtable to a separate ops structure (theoretically allows better caching for multiple instances of the same ops structure) Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Allow PIT emulation without speaker portJan Kiszka
The in-kernel speaker emulation is only a dummy and also unneeded from the performance point of view. Rather, it takes user space support to generate sound output on the host, e.g. console beeps. To allow this, introduce KVM_CREATE_PIT2 which controls in-kernel speaker port emulation via a flag passed along the new IOCTL. It also leaves room for future extensions of the PIT configuration interface. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: irqfdGregory Haskins
KVM provides a complete virtual system environment for guests, including support for injecting interrupts modeled after the real exception/interrupt facilities present on the native platform (such as the IDT on x86). Virtual interrupts can come from a variety of sources (emulated devices, pass-through devices, etc) but all must be injected to the guest via the KVM infrastructure. This patch adds a new mechanism to inject a specific interrupt to a guest using a decoupled eventfd mechnanism: Any legal signal on the irqfd (using eventfd semantics from either userspace or kernel) will translate into an injected interrupt in the guest at the next available interrupt window. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Drop interrupt shadow when single stepping should be done only on VMXGleb Natapov
The problem exists only on VMX. Also currently we skip this step if there is pending exception. The patch fixes this too. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: fix cpuid E2BIG handling for extended request typesMark McLoughlin
If we run out of cpuid entries for extended request types we should return -E2BIG, just like we do for the standard request types. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Use MSR names in place of addressJaswinder Singh Rajput
Replace 0xc0010010 with MSR_K8_SYSCFG and 0xc0010015 with MSR_K7_HWCR. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Add MCE supportHuang Ying
The related MSRs are emulated. MCE capability is exported via extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED. A new vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation such as the mcg_cap. MCE is injected via vcpu ioctl command KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are not implemented. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Replace MSR_IA32_TIME_STAMP_COUNTER with MSR_IA32_TSC of msr-index.hJaswinder Singh Rajput
Use standard msr-index.h's MSR declaration. MSR_IA32_TSC is better than MSR_IA32_TIME_STAMP_COUNTER as it also solves 80 column issue. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: x86: verify MTRR/PAT validityMarcelo Tosatti
Do not allow invalid memory types in MTRR/PAT (generating a #GP otherwise). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05KVM: Fix KVM_GET_MSR_INDEX_LISTJan Kiszka
So far, KVM copied the emulated_msrs (only MSR_IA32_MISC_ENABLE) to a wrong address in user space due to broken pointer arithmetic. This caused subtle corruption up there (missing MSR_IA32_MISC_ENABLE had probably no practical relevance). Moreover, the size check for the user-provided kvm_msr_list forgot about emulated MSRs. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-28KVM: Ignore reads to K7 EVNTSEL MSRsAmit Shah
In commit 7fe29e0faacb650d31b9e9f538203a157bec821d we ignored the reads to the P6 EVNTSEL MSRs. That fixed crashes on Intel machines. Ignore the reads to K7 EVNTSEL MSRs as well to fix this on AMD hosts. This fixes Kaspersky antivirus crashing Windows guests on AMD hosts. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Disable CR8 intercept if tpr patching is activeGleb Natapov
Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Do not migrate pending software interrupts.Gleb Natapov
INTn will be re-executed after migration. If we wanted to migrate pending software interrupt we would need to migrate interrupt type and instruction length too, but we do not have all required info on SVM, so SVM->VMX migration would need to re-execute INTn anyway. To make it simple never migrate pending soft interrupt. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Always request IRQ/NMI window if an interrupt is pendingGleb Natapov
Currently they are not requested if there is pending exception. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Do not re-execute INTn instruction.Gleb Natapov
Re-inject event instead. This is what Intel suggest. Also use correct instruction length when re-injecting soft fault/interrupt. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Remove irq_pending bitmapGleb Natapov
Only one interrupt vector can be injected from userspace irqchip at any given time so no need to store it in a bitmap. Put it into interrupt queue directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Do not allow interrupt injection from userspace if there is a pending ↵Gleb Natapov
event. The exception will immediately close the interrupt window. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: x86: check for cr3 validity in mmu_alloc_rootsMarcelo Tosatti
Verify the cr3 address stored in vcpu->arch.cr3 points to an existant memslot. If not, inject a triple fault. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: MMU: protect kvm_mmu_change_mmu_pages with mmu_lockMarcelo Tosatti
kvm_handle_hva, called by MMU notifiers, manipulates mmu data only with the protection of mmu_lock. Update kvm_mmu_change_mmu_pages callers to take mmu_lock, thus protecting against kvm_handle_hva. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Deal with interrupt shadow state for emulated instructionsGlauber Costa
We currently unblock shadow interrupt state when we skip an instruction, but failing to do so when we actually emulate one. This blocks interrupts in key instruction blocks, in particular sti; hlt; sequences If the instruction emulated is an sti, we have to block shadow interrupts. The same goes for mov ss. pop ss also needs it, but we don't currently emulate it. Without this patch, I cannot boot gpxe option roms at vmx machines. This is described at https://bugzilla.redhat.com/show_bug.cgi?id=494469 Signed-off-by: Glauber Costa <glommer@redhat.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Replace ->drop_interrupt_shadow() by ->set_interrupt_shadow()Glauber Costa
This patch replaces drop_interrupt_shadow with the more general set_interrupt_shadow, that can either drop or raise it, depending on its parameter. It also adds ->get_interrupt_shadow() for future use. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: use smp_send_reschedule in kvm_vcpu_kickMarcelo Tosatti
KVM uses a function call IPI to cause the exit of a guest running on a physical cpu. For virtual interrupt notification there is no need to wait on IPI receival, or to execute any function. This is exactly what the reschedule IPI does, without the overhead of function IPI. So use it instead of smp_call_function_single in kvm_vcpu_kick. Also change the "guest_mode" variable to a bit in vcpu->requests, and use that to collapse multiple IPI's that would be issued between the first one and zeroing of guest mode. This allows kvm_vcpu_kick to called with interrupts disabled. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Update cpuid 1.ecx reportingAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Add AMD cpuid bit: cr8_legacy, abm, misaligned sse, sse4, 3dnow prefetchAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Fix cpuid feature misreportingAvi Kivity
MTRR, PAT, MCE, and MCA are all supported (to some extent) but not reported. Vista requires these features, so if userspace relies on kernel cpuid reporting, it loses support for Vista. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Drop request_nmi from statsJan Kiszka
The stats entry request_nmi is no longer used as the related user space interface was dropped. So clean it up. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Replace get_mt_mask_shift with get_mt_maskSheng Yang
Shadow_mt_mask is out of date, now it have only been used as a flag to indicate if TDP enabled. Get rid of it and use tdp_enabled instead. Also put memory type logical in kvm_x86_ops->get_mt_mask(). Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Wake up waitqueue before calling get_cpu()Jan Blunck
This moves the get_cpu() call down to be called after we wake up the waiters. Therefore the waitqueue locks can safely be rt mutex. Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Get rid of get_irq() callbackGleb Natapov
It just returns pending IRQ vector from the queue for VMX/SVM. Get IRQ directly from the queue before migration and put it back after. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>