aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2009-09-10KVM: MMU audit: update count_writable_mappings / count_rmapsMarcelo Tosatti
Under testing, count_writable_mappings returns a value that is 2 integers larger than what count_rmaps returns. Suspicion is that either of the two functions is counting a duplicate (either positively or negatively). Modifying check_writable_mappings_rmap to check for rmap existance on all present MMU pages fails to trigger an error, which should keep Avi happy. Also introduce mmu_spte_walk to invoke a callback on all present sptes visible to the current vcpu, might be useful in the future. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: MMU: introduce is_last_spte helperMarcelo Tosatti
Hiding some of the last largepage / level interaction (which is useful for gbpages and for zero based levels). Also merge the PT_PAGE_TABLE_LEVEL clearing loop in unlink_children. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Return to userspace on emulation failureAvi Kivity
Instead of mindlessly retrying to execute the instruction, report the failure to userspace. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Use macro to iterate over vcpus.Gleb Natapov
[christian: remove unused variables on s390] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Break dependency between vcpu index in vcpus array and vcpu_id.Gleb Natapov
Archs are free to use vcpu_id as they see fit. For x86 it is used as vcpu's apic id. New ioctl is added to configure boot vcpu id that was assumed to be 0 till now. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Use pointer to vcpu instead of vcpu_id in timer code.Gleb Natapov
Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Introduce kvm_vcpu_is_bsp() function.Gleb Natapov
Use it instead of open code "vcpu_id zero is BSP" assumption. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: MMU: s/shadow_pte/spte/Avi Kivity
We use shadow_pte and spte inconsistently, switch to the shorter spelling. Rename set_shadow_pte() to __set_spte() to avoid a conflict with the existing set_spte(), and to indicate its lowlevelness. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: MMU: Adjust pte accessors to explicitly indicate guest or shadow pteAvi Kivity
Since the guest and host ptes can have wildly different format, adjust the pte accessor names to indicate on which type of pte they operate on. No functional changes. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: MMU: Fix is_dirty_pte()Avi Kivity
is_dirty_pte() is used on guest ptes, not shadow ptes, so it needs to avoid shadow_dirty_mask and use PT_DIRTY_MASK instead. Misdetecting dirty pages could lead to unnecessarily setting the dirty bit under EPT. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Move rmode structure to vmx-specific codeAvi Kivity
rmode is only used in vmx, so move it to vmx.c Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Support Unrestricted Guest featureNitin A Kamble
"Unrestricted Guest" feature is added in the VMX specification. Intel Westmere and onwards processors will support this feature. It allows kvm guests to run real mode and unpaged mode code natively in the VMX mode when EPT is turned on. With the unrestricted guest there is no need to emulate the guest real mode code in the vm86 container or in the emulator. Also the guest big real mode code works like native. The attached patch enhances KVM to use the unrestricted guest feature if available on the processor. It also adds a new kernel/module parameter to disable the unrestricted guest feature at the boot time. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: switch irq injection/acking data structures to irq_lockMarcelo Tosatti
Protect irq injection/acking data structures with a separate irq_lock mutex. This fixes the following deadlock: CPU A CPU B kvm_vm_ioctl_deassign_dev_irq() mutex_lock(&kvm->lock); worker_thread() -> kvm_deassign_irq() -> kvm_assigned_dev_interrupt_work_handler() -> deassign_host_irq() mutex_lock(&kvm->lock); -> cancel_work_sync() [blocked] [gleb: fix ia64 path] Reported-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Grab pic lock in kvm_pic_clear_isr_ackMarcelo Tosatti
isr_ack is protected by kvm_pic->lock. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Cleanup LAPIC interfaceJan Kiszka
None of the interface services the LAPIC emulation provides need to be exported to modules, and kvm_lapic_get_base is even totally unused today. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: ppc: e500: Add MMUCFG and PVR emulationLiu Yu
Latest kernel started to use these two registers. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: ppc: e500: Directly pass pvr to guestLiu Yu
Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: ppc: e500: Move to Book-3e MMU definitionsLiu Yu
According to commit 70fe3af8403f85196bb74f22ce4813db7dfedc1a. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Fix reporting of unhandled EPT violationsAvi Kivity
Instead of returning -ENOTSUPP, exit normally but indicate the hardware exit reason. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Cache pdptrsAvi Kivity
Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx, guest memory access on svm) extract them on demand. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Simplify pdptr and cr3 managementAvi Kivity
Instead of reading the PDPTRs from memory after every exit (which is slow and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from memory to the VMCS before entry, and from the VMCS to memory after exit. Do the same for cr3. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Avoid duplicate ept tlb flush when setting cr3Avi Kivity
vmx_set_cr3() will call vmx_tlb_flush(), which will flush the ept context. So there is no need to call ept_sync_context() explicitly. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: do not register i8254 PIO regions until we are initializedGregory Haskins
We currently publish the i8254 resources to the pio_bus before the devices are fully initialized. Since we hold the pit_lock, its probably not a real issue. But lets clean this up anyway. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: cleanup io_device codeGregory Haskins
We modernize the io_device code so that we use container_of() instead of dev->private, and move the vtable to a separate ops structure (theoretically allows better caching for multiple instances of the same ops structure) Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: powerpc: fix some init/exit annotationsStephen Rothwell
Fixes a couple of warnings like this one: WARNING: arch/powerpc/kvm/kvm-440.o(.text+0x1e8c): Section mismatch in reference from the function kvmppc_44x_exit() to the function .exit.text:kvmppc_booke_exit() The function kvmppc_44x_exit() references a function in an exit section. Often the function kvmppc_booke_exit() has valid usage outside the exit section and the fix is to remove the __exit annotation of kvmppc_booke_exit. Also add some __init annotations on obvious routines. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: Fold kvm_svm.h info svm.cAvi Kivity
kvm_svm.h is only included from svm.c, so fold it in. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: SVM: use explicit 64bit storage for sysenter valuesAndre Przywara
Since AMD does not support sysenter in 64bit mode, the VMCB fields storing the MSRs are truncated to 32bit upon VMRUN/#VMEXIT. So store the values in a separate 64bit storage to avoid truncation. [andre: fix amd->amd migration] Signed-off-by: Christoph Egger <christoph.egger@amd.com> Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: s390: streamline memslot handlingChristian Ehrhardt
This patch relocates the variables kvm-s390 uses to track guest mem addr/size. As discussed dropping the variables at struct kvm_arch level allows to use the common vcpu->request based mechanism to reload guest memory if e.g. changes via set_memory_region. The kick mechanism introduced in this series is used to ensure running vcpus leave guest state to catch the update. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: s390: fix signal handlingChristian Ehrhardt
If signal pending is true we exit without updating kvm_run, userspace currently just does nothing and jumps to kvm_run again. Since we did not set an exit_reason we might end up with a random one (whatever was the last exit). Therefore it was possible to e.g. jump to the psw position the last real interruption set. Setting the INTR exit reason ensures that no old psw data is swapped in on reentry. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: s390: infrastructure to kick vcpus out of guest stateChristian Ehrhardt
To ensure vcpu's come out of guest context in certain cases this patch adds a s390 specific way to kick them out of guest context. Currently it kicks them out to rerun the vcpu_run path in the s390 code, but the mechanism itself is expandable and with a new flag we could also add e.g. kicks to userspace etc. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: ia64: Correct itc_offset calculationsJes Sorensen
Init the itc_offset for all possible vCPUs. The current code by mistake ends up only initializing the offset on vCPU 0. Spotted by Gleb Natapov. Signed-off-by: Jes Sorensen <jes@sgi.com> Acked-by : Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Allow PIT emulation without speaker portJan Kiszka
The in-kernel speaker emulation is only a dummy and also unneeded from the performance point of view. Rather, it takes user space support to generate sound output on the host, e.g. console beeps. To allow this, introduce KVM_CREATE_PIT2 which controls in-kernel speaker port emulation via a flag passed along the new IOCTL. It also leaves room for future extensions of the PIT configuration interface. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: irqfdGregory Haskins
KVM provides a complete virtual system environment for guests, including support for injecting interrupts modeled after the real exception/interrupt facilities present on the native platform (such as the IDT on x86). Virtual interrupts can come from a variety of sources (emulated devices, pass-through devices, etc) but all must be injected to the guest via the KVM infrastructure. This patch adds a new mechanism to inject a specific interrupt to a guest using a decoupled eventfd mechnanism: Any legal signal on the irqfd (using eventfd semantics from either userspace or kernel) will translate into an injected interrupt in the guest at the next available interrupt window. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Move common KVM Kconfig items to new file virt/kvm/KconfigAvi Kivity
Reduce Kconfig code duplication. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Drop interrupt shadow when single stepping should be done only on VMXGleb Natapov
The problem exists only on VMX. Also currently we skip this step if there is pending exception. The patch fixes this too. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: cleanup arch/x86/kvm/MakefileChristoph Hellwig
Use proper foo-y style list additions to cleanup all the conditionals, move module selection after compound object selection and remove the superflous comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: x86 emulator: fix jmp far decoding (opcode 0xea)Avi Kivity
The jump target should not be sign extened; use an unsigned decode flag. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: x86 emulator: Implement zero-extended immediate decodingAvi Kivity
Absolute jumps use zero extended immediate operands. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: fix cpuid E2BIG handling for extended request typesMark McLoughlin
If we run out of cpuid entries for extended request types we should return -E2BIG, just like we do for the standard request types. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Use MSR names in place of addressJaswinder Singh Rajput
Replace 0xc0010010 with MSR_K8_SYSCFG and 0xc0010015 with MSR_K7_HWCR. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Add MCE supportHuang Ying
The related MSRs are emulated. MCE capability is exported via extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED. A new vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation such as the mcg_cap. MCE is injected via vcpu ioctl command KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are not implemented. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Replace MSR_IA32_TIME_STAMP_COUNTER with MSR_IA32_TSC of msr-index.hJaswinder Singh Rajput
Use standard msr-index.h's MSR declaration. MSR_IA32_TSC is better than MSR_IA32_TIME_STAMP_COUNTER as it also solves 80 column issue. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Properly handle software interrupt re-injection in real modeGleb Natapov
When reinjecting a software interrupt or exception, use the correct instruction length provided by the hardware instead of a hardcoded 1. Fixes problems running the suse 9.1 livecd boot loader. Problem introduced by commit f0a3602c20 ("KVM: Move interrupt injection logic to x86.c"). Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-05powerpc: Fix i8259 interrupt driver kernel crash on ML510Roderick Colenbrander
This patch fixes a null pointer exception caused by removal of 'ack()' for level interrupts in the Xilinx interrupt driver. A recent change to the xilinx interrupt controller removed the ack hook for level irqs. Signed-off-by: Roderick Colenbrander <thunderbird2k@gmail.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-05Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] fix csum_ipv6_magic() [IA64] Fix warning in dma-mapping.c
2009-09-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix bootup with mcount in some configs. sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.
2009-09-05Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter/powerpc: Fix cache event codes for POWER7 perf_counter: Fix /0 bug in swcounters perf_counters: Increase paranoia level
2009-09-04sparc64: Fix bootup with mcount in some configs.David S. Miller
Functions invoked early when booting up a cpu can't use tracing because mcount requires a valid 'current_thread_info()' and TLB mappings to be setup. The code path of sun4v_register_mondo_queues --> register_one_mondo is one such case. sun4v_register_mondo_queues already has the necessary 'notrace' annotation, but register_one_mondo does not. Normally register_one_mondo is inlined so the bug doesn't trigger, but with some config/compiler combinations, it won't be so we must properly mark it notrace. While we're here, add 'notrace' annoations to prom_printf and prom_halt so that early error handling won't have the same problem. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Reported-by: Leif Sawyer <lsawyer@gci.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-03sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.David S. Miller
This is a compromise and a temporary workaround for bootup NMI watchdog triggers some people see with qla2xxx devices present. This happens when, for example: CPU 0 is in the driver init and looping submitting mailbox commands to load the firmware, then waiting for completion. CPU 1 is receiving the device interrupts. CPU 1 is where the NMI watchdog triggers. CPU 0 is submitting mailbox commands fast enough that by the time CPU 1 returns from the device interrupt handler, a new one is pending. This sequence runs for more than 5 seconds. The problematic case is CPU 1's timer interrupt running when the barrage of device interrupts begin. Then we have: timer interrupt return for softirq checking pending, thus enable interrupts qla2xxx interrupt return qla2xxx interrupt return ... 5+ seconds pass final qla2xxx interrupt for fw load return run timer softirq return At some point in the multi-second qla2xxx interrupt storm we trigger the NMI watchdog on CPU 1 from the NMI interrupt handler. The timer softirq, once we get back to running it, is smart enough to run the timer work enough times to make up for the missed timer interrupts. However, the NMI watchdogs (both x86 and sparc) use the timer interrupt count to notice the cpu is wedged. But in the above scenerio we'll receive only one such timer interrupt even if we last all the way back to running the timer softirq. The default watchdog trigger point is only 5 seconds, which is pretty low (the softwatchdog triggers at 60 seconds). So increase it to 30 seconds for now. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-03perf_counter/powerpc: Fix cache event codes for POWER7Paul Mackerras
I had the codes for L1 D-cache load accesses and misses swapped around, and the wrong codes for LL-cache accesses and misses. This corrects them. Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> LKML-Reference: <19103.8514.709300.585484@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>