aboutsummaryrefslogtreecommitdiff
path: root/drivers/kvm/svm.c
AgeCommit message (Collapse)Author
2007-05-21Detach sched.h from mm.hAlexey Dobriyan
First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-03KVM: SVM: Only save/restore MSRs when neededAnthony Liguori
We only have to save/restore MSR_GS_BASE on every VMEXIT. The rest can be saved/restored when we leave the VCPU. Since we don't emulate the DEBUGCTL MSRs and the guest cannot write to them, we don't have to worry about saving/restoring them at all. This shaves a whopping 40% off raw vmexit costs on AMD. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: VMX: Properly shadow the CR0 register in the vcpu structAnthony Liguori
Set all of the host mask bits for CR0 so that we can maintain a proper shadow of CR0. This exposes CR0.TS, paving the way for lazy fpu handling. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Lazy FPU support for SVMAnthony Liguori
Avoid saving and restoring the guest fpu state on every exit. This shaves ~100 cycles off the guest/host switch. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Per-vcpu statisticsAvi Kivity
Make the exit statistics per-vcpu instead of global. This gives a 3.5% boost when running one virtual machine per core on my two socket dual core (4 cores total) machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: SVM: Report hardware exit reason to userspace instead of dmesgAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Use kernel-standard typesAvi Kivity
Noted by Joerg Roedel. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: SVM: enable LBRV virtualization if availableJoerg Roedel
This patch enables the virtualization of the last branch record MSRs on SVM if this feature is available in hardware. It also introduces a small and simple check feature for specific SVM extensions. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Remove unused functionMichal Piotrowski
Remove unused function CC drivers/kvm/svm.o drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: SVM: Ensure timestamp counter monotonicityAvi Kivity
When a vcpu is migrated from one cpu to another, its timestamp counter may lose its monotonic property if the host has unsynced timestamp counters. This can confuse the guest, sometimes to the point of refusing to boot. As the rdtsc instruction is rather fast on AMD processors (7-10 cycles), we can simply record the last host tsc when we drop the cpu, and adjust the vcpu tsc offset when we detect that we've migrated to a different cpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: SVM: forbid guest to execute monitor/mwaitJoerg Roedel
This patch forbids the guest to execute monitor/mwait instructions on SVM. This is necessary because the guest can execute these instructions if they are available even if the kvm cpuid doesn't report its existence. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Remove unused and write-only variablesAvi Kivity
Trivial cleanup. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Don't allow the guest to turn off the cpu cacheAvi Kivity
The cpu cache is a host resource; the guest should not be able to turn it off (even for itself). Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Remove set_cr0_no_modeswitch() arch opAvi Kivity
set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers. As we now cache the protected mode values on entry to real mode, this isn't an issue anymore, and it interferes with reboot (which usually _is_ a modeswitch). Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Avoid guest virtual addresses in string pio userspace interfaceAvi Kivity
The current string pio interface communicates using guest virtual addresses, relying on userspace to translate addresses and to check permissions. This interface cannot fully support guest smp, as the check needs to take into account two pages at one in case an unaligned string transfer straddles a page boundary. Change the interface not to communicate guest addresses at all; instead use a buffer page (mmaped by userspace) and do transfers there. The kernel manages the virtual to physical translation and can perform the checks atomically by taking the appropriate locks. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Initialize the apic_base msr on svm tooAvi Kivity
Older userspace didn't care, but newer userspace (with the cpuid changes) does. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Add a special exit reason when exiting due to an interruptAvi Kivity
This is redundant, as we also return -EINTR from the ioctl, but it allows us to examine the exit_reason field on resume without seeing old data. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Fold kvm_run::exit_type into kvm_run::exit_reasonAvi Kivity
Currently, userspace is told about the nature of the last exit from the guest using two fields, exit_type and exit_reason, where exit_type has just two enumerations (and no need for more). So fold exit_type into exit_reason, reducing the complexity of determining what really happened. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Handle cpuid in the kernel instead of punting to userspaceAvi Kivity
KVM used to handle cpuid by letting userspace decide what values to return to the guest. We now handle cpuid completely in the kernel. We still let userspace decide which values the guest will see by having userspace set up the value table beforehand (this is necessary to allow management software to set the cpu features to the least common denominator, so that live migration can work). The motivation for the change is that kvm kernel code can be impacted by cpuid features, for example the x86 emulator. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Do not communicate to userspace through cpu registers during PIOAvi Kivity
Currently when passing the a PIO emulation request to userspace, we rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx (on string instructions). This (a) requires two extra ioctls for getting and setting the registers and (b) is unfriendly to non-x86 archs, when they get kvm ports. So fix by doing the register fixups in the kernel and passing to userspace only an abstract description of the PIO to be done. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03KVM: Use the generic skip_emulated_instruction() in hypercall codeDor Laor
Instead of twiddling the rip registers directly, use the skip_emulated_instruction() function to do that for us. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: Per-vcpu inodesAvi Kivity
Allocate a distinct inode for every vcpu in a VM. This has the following benefits: - the filp cachelines are no longer bounced when f_count is incremented on every ioctl() - the API and internal code are distinctly clearer; for example, on the KVM_GET_REGS ioctl, there is no need to copy the vcpu number from userspace and then copy the registers back; the vcpu identity is derived from the fd used to make the call Right now the performance benefits are completely theoretical since (a) we don't support more than one vcpu per VM and (b) virtualization hardware inefficiencies completely everwhelm any cacheline bouncing effects. But both of these will change, and we need to prepare the API today. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: SVM: intercept SMI to handle it at host levelJoerg Roedel
This patch changes the SVM code to intercept SMIs and handle it outside the guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: svm: init cr0 with the wp bit setAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: Wire up hypercall handlers to a central arch-independent locationAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: Add hypercall host support for svmAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: add MSR based hypercall APIIngo Molnar
This adds a special MSR based hypercall API to KVM. This is to be used by paravirtual kernels and virtual drivers. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: Use ARRAY_SIZE macro instead of manual calculation.Ahmed S. Darwish
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-03-04KVM: CosmeticsAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-02-12[PATCH] KVM: cpu hotplug supportAvi Kivity
On hotplug, we execute the hardware extension enable sequence. On unplug, we decache any vcpus that last ran on the exiting cpu, and execute the hardware extension disable sequence. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] kvm: SVM: Hack initial cpu csbase to be consistent with intelAvi Kivity
This allows us to run the mmu testsuite on amd. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] kvm: Fix mmu going crazy of guest sets cr0.wp == 0Avi Kivity
The kvm mmu relies on cr0.wp being set even if the guest does not set it. The vmx code correctly forces cr0.wp at all times, the svm code does not, so it can't boot solaris without this patch. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-09[PATCH] kvm: NULL noise removalAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-30[PATCH] Fix "CONFIG_X86_64_" typo in drivers/kvm/svm.cRobert P. J. Day
Fix what looks like an obvious typo in the file drivers/kvm/svm.c. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] KVM: SVM: Propagate cpu shutdown events to userspaceJoerg Roedel
This patch implements forwarding of SHUTDOWN intercepts from the guest on to userspace on AMD SVM. A SHUTDOWN event occurs when the guest produces a triple fault (e.g. on reboot). This also fixes the bug that a guest reboot actually causes a host reboot under some circumstances. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-26[PATCH] KVM: SVM: Fix SVM idt confusionLeonard Norrgard
There's an obvious typo in svm_{get,set}_idt, causing it to access the ldt instead. Because these functions are only called for save/load on AMD, the bug does not impact normal operation. With the fix, save/load works as expected on AMD hosts. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-23[PATCH] KVM: fix race between mmio reads and injected interruptsAvi Kivity
The kvm mmio read path looks like: 1. guest read faults 2. kvm emulates read, calls emulator_read_emulated() 3. fails as a read requires userspace help 4. exit to userspace 5. userspace emulates read, kvm sets vcpu->mmio_read_completed 6. re-enter guest, fault again 7. kvm emulates read, calls emulator_read_emulated() 8. succeeds as vcpu->mmio_read_emulated is set 9. instruction completes and guest is resumed A problem surfaces if the userspace exit (step 5) also requests an interrupt injection. In that case, the guest does not re-execute the original instruction, but the interrupt handler. The next time an mmio read is exectued (likely for a different address), step 3 will find vcpu->mmio_read_completed set and return the value read for the original instruction. The problem manifested itself in a few annoying ways: - little squares appear randomly on console when switching virtual terminals - ne2000 fails under nfs read load - rtl8139 complains about "pci errors" even though the device model is incapable of issuing them. Fix by skipping interrupt injection if an mmio read is pending. A better fix is to avoid re-entry into the guest, and re-emulating immediately instead. However that's a bit more complex. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-11[PATCH] KVM: add VM-exit profilingIngo Molnar
This adds the profile=kvm boot option, which enables KVM to profile VM exits. Use: "readprofile -m ./System.map | sort -n" to see the resulting output: [...] 18246 serial_out 148.3415 18945 native_flush_tlb 378.9000 23618 serial_in 212.7748 29279 __spin_unlock_irq 622.9574 43447 native_apic_write 2068.9048 52702 enable_8259A_irq 742.2817 54250 vgacon_scroll 89.3740 67394 ide_inb 6126.7273 79514 copy_page_range 98.1654 84868 do_wp_page 86.6000 140266 pit_read 783.6089 151436 ide_outb 25239.3333 152668 native_io_delay 21809.7143 174783 mask_and_ack_8259A 783.7803 362404 native_set_pte_at 36240.4000 1688747 total 0.5009 Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] KVM: Simplify test for interrupt windowDor Laor
No need to test for rflags.if as both VT and SVM specs assure us that on exit caused from interrupt window opening, 'if' is set. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] KVM: MMU: Detect oom conditions and propagate error to userspaceAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] KVM: MMU: Remove invlpg interceptionAvi Kivity
Since we write protect shadowed guest page tables, there is no need to trap page invalidations (the guest will always change the mapping before issuing the invlpg instruction). Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] KVM: MMU: oom handlingAvi Kivity
When beginning to process a page fault, make sure we have enough shadow pages available to service the fault. If not, free some pages. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] KVM: Prevent stale bits in cr0 and cr4Avi Kivity
Hardware virtualization implementations allow the guests to freely change some of the bits in cr0 and cr4, but trap when changing the other bits. This is useful to avoid excessive exits due to changing, for example, the ts flag. It also means the kvm's copy of cr0 and cr4 may be stale with respect to these bits. most of the time this doesn't matter as these bits are not very interesting. Other times, however (for example when returning cr0 to userspace), they are, so get the fresh contents of these bits from the guest by means of a new arch operation. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05[PATCH] KVM: Improve interrupt responseDor Laor
The current interrupt injection mechanism might delay an interrupt under the following circumstances: - if injection fails because the guest is not interruptible (rflags.IF clear, or after a 'mov ss' or 'sti' instruction). Userspace can check rflags, but the other cases or not testable under the current API. - if injection fails because of a fault during delivery. This probably never happens under normal guests. - if injection fails due to a physical interrupt causing a vmexit so that it can be handled by the host. In all cases the guest proceeds without processing the interrupt, reducing the interactive feel and interrupt throughput of the guest. This patch fixes the situation by allowing userspace to request an exit when the 'interrupt window' opens, so that it can re-inject the interrupt at the right time. Guest interactivity is very visibly improved. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-30[PATCH] KVM: Move common msr handling to arch independent codeAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-30[PATCH] KVM: Implement a few system configuration msrsAvi Kivity
Resolves sourceforge bug 1622229 (guest crashes running benchmark software). Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-30[PATCH] KVM: Simplify is_long_mode()Avi Kivity
Instead of doing tricky stuff with the arch dependent virtualization registers, take a peek at the guest's efer. This simlifies some code, and fixes some confusion in the mmu branch. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-30[PATCH] KVM: Use boot_cpu_data instead of current_cpu_dataAvi Kivity
current_cpu_data invokes smp_processor_id(), which is inadvisable when preemption is enabled. Switch to boot_cpu_data instead. Resolves sourceforge bug 1621401. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] KVM: Handle p5 mce msrsMichael Riepe
This allows plan9 to get a little further booting. Signed-off-by: Michael Riepe <michael@mr511.de> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] KVM: AMD SVM: Save and restore the floating point unit stateAvi Kivity
Fixes sf bug 1614113 (segfaults in nbench). Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>