path: root/drivers
Age | Commit message | Author
2007-05-03 | KVM: Use slab caches to allocate mmu data structures | Avi Kivity
Better leak detection, statistics, memory use, speed -- goodness all around. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Handle partial pae pdptr | Avi Kivity
Some guests (Solaris) do not set up all four pdptrs, but leave some invalid. kvm incorrectly treated these as valid page directories, pinning the wrong pages and causing general confusion. Fix by checking the valid bit of a pae pdpte. This closes sourceforge bug 1698922. Signed-off-by: Avi Kivity <avi@qumranet.com>
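The valid-bit check described above can be sketched as follows. This is a minimal illustration, not kvm's code: it assumes a PAE page directory pointer table of four 64-bit entries in which bit 0 marks an entry as present, and the names `PDPTE_PRESENT`, `pdpte_is_valid` and `collect_valid_pdptes` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PDPTE_PRESENT 0x1ULL  /* bit 0 of a PAE pdpte: entry is valid */

/* Returns nonzero if the pdpte refers to a usable page directory. */
int pdpte_is_valid(uint64_t pdpte)
{
    return (pdpte & PDPTE_PRESENT) != 0;
}

/*
 * Copy the page directory base of every valid entry into out[] and
 * return how many were found. Invalid entries are skipped instead of
 * being treated as page directories.
 */
size_t collect_valid_pdptes(const uint64_t pdptrs[4], uint64_t out[4])
{
    size_t n = 0;

    for (size_t i = 0; i < 4; i++) {
        if (!pdpte_is_valid(pdptrs[i]))
            continue;                      /* guest left this slot invalid */
        out[n++] = pdptrs[i] & ~0xfffULL;  /* drop the low 12 flag bits */
    }
    return n;
}
```

A guest that fills only two of the four slots then yields two page directories rather than four.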
2007-05-03 | KVM: Initialize cr0 to indicate an fpu is present | Avi Kivity
Solaris panics if it sees a cpu with no fpu, and it seems to rely on this bit. Closes sourceforge bug 1698920. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Fix overflow bug in overflow detection code | Eric Sesterhenn / Snakebyte
The expression sp - 6 < sp where sp is a u16 is undefined in C since 'sp - 6' is promoted to int, and signed overflow is undefined in C. gcc 4.2 actually warns about it. Replace with a simpler test. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
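A small sketch of why the promoted comparison is not a useful wrap test. It assumes a platform where int is wider than 16 bits, so the subtraction is carried out in int; the function names are hypothetical, and `sp >= 6` is only an assumed form of the "simpler test", not necessarily the one the patch uses.

```c
#include <stdint.h>

/*
 * Promoted form: sp is converted to int before the subtraction, so the
 * difference never wraps around at 16 bits and the comparison cannot
 * tell that fewer than 6 bytes are left below sp.
 */
int has_room_promoted(uint16_t sp)
{
    return sp - 6 < sp;
}

/* Simpler test (assumed replacement): compare against the constant. */
int has_room_simple(uint16_t sp)
{
    return sp >= 6;
}
```

With 32-bit int, `has_room_promoted()` accepts every value of sp, including 0, while `has_room_simple()` rejects anything below 6.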
2007-05-03 | KVM: Use kernel-standard types | Avi Kivity
Noted by Joerg Roedel. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: SVM: enable LBRV virtualization if available | Joerg Roedel
This patch enables the virtualization of the last branch record MSRs on SVM if this feature is available in hardware. It also introduces a small and simple check feature for specific SVM extensions. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Add fpu get/set operations | Avi Kivity
These are really helpful when migrating a floating point app to another machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Add physical memory aliasing feature | Avi Kivity
With this, we can specify that accesses to one physical memory range will be remapped to another. This is useful for the vga window at 0xa0000 which is used as a movable window into the (much larger) framebuffer. Signed-off-by: Avi Kivity <avi@qumranet.com>
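The remapping can be pictured as a lookup over a small table of aliases, applied to a guest frame number before any further translation. The sketch below is illustrative only: `struct mem_alias`, its fields and `unalias_gfn` are assumed names, and the target frame in the usage note is made up.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

struct mem_alias {
    gfn_t base_gfn;    /* first guest frame of the aliased range */
    gfn_t npages;      /* length of the aliased range, in pages */
    gfn_t target_gfn;  /* frame the start of the range is redirected to */
};

/*
 * Return the frame that gfn really refers to: the redirected frame if
 * gfn falls inside an alias, otherwise gfn unchanged.
 */
gfn_t unalias_gfn(const struct mem_alias *aliases, size_t n, gfn_t gfn)
{
    for (size_t i = 0; i < n; i++) {
        const struct mem_alias *a = &aliases[i];

        if (gfn >= a->base_gfn && gfn - a->base_gfn < a->npages)
            return a->target_gfn + (gfn - a->base_gfn);
    }
    return gfn;
}
```

For the vga window, an alias with base frame 0xa0 (address 0xa0000) and 0x20 pages would redirect frame 0xa5 to the sixth page of whatever framebuffer region the alias currently targets; moving the window only means changing `target_gfn`.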
2007-05-03 | KVM: Simplify gfn_to_page() | Avi Kivity
Mapping a guest page to a host page is a common operation. Currently, one has first to find the memory slot where the page belongs (gfn_to_memslot), then locate the page itself (gfn_to_page()). This is clumsy, and also won't work well with memory aliases. So simplify gfn_to_page() not to require memory slot translation first, and instead do it internally. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Add mmu cache clear function | Dor Laor
Functions that play around with the physical memory map need a way to clear mappings to possibly nonexistent or invalid memory. Both the mmu cache and the processor tlb are cleared. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: x86 emulator: fix bit string operations operand size | Avi Kivity
On x86, bit operations operate on a string of bits that can reside in multiple words. For example, 'btsl %eax, (blah)' will touch the word at blah+4 if %eax is between 32 and 63.

The x86 emulator compensates for that by advancing the operand address by (bit offset / BITS_PER_LONG) and truncating the bit offset to the range (0..BITS_PER_LONG-1). This has a side effect of forcing the operand size to 8 bytes on 64-bit hosts.

Now, a 32-bit guest goes and fork()s a process. It write protects a stack page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page table, with bit offset 1 (for the write permission bit). The emulator now forces the operand size to 8 bytes as previously described, and an innocent page table update turns into a cross-page-boundary write, which is assumed by the mmu code not to be a page table, so it doesn't actually clear the corresponding shadow page table entry. The guest and host permissions are out of sync and guest memory is corrupted soon afterwards, leading to guest failure.

Fix by not using BITS_PER_LONG as the word size; instead use the actual operand size, so we get a 32-bit write in that case.

Note we still have to teach the mmu to handle cross-page-boundary writes to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20) to boot.

Signed-off-by: Avi Kivity <avi@qumranet.com>
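The address adjustment and the page-crossing effect can be worked through with a short sketch. It is not the emulator's code: `normalize_bitop` and `crosses_page` are hypothetical helpers, the page size is assumed to be 4096 bytes, and only non-negative bit offsets are handled (register bit offsets are signed on real hardware).

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

struct bitop {
    uint64_t addr;  /* address of the word that will actually be touched */
    unsigned bit;   /* bit index within that word */
};

/*
 * Fold a bit offset into the operand address, using word_bytes as the
 * word size: 4 for a 32-bit operand, 8 for a 64-bit operand.
 */
struct bitop normalize_bitop(uint64_t addr, uint64_t bit_offset,
                             unsigned word_bytes)
{
    unsigned word_bits = word_bytes * 8;
    struct bitop r;

    r.addr = addr + (bit_offset / word_bits) * word_bytes;
    r.bit = (unsigned)(bit_offset % word_bits);
    return r;
}

/* Nonzero if an access of the given size at addr spans two pages. */
int crosses_page(uint64_t addr, unsigned bytes)
{
    return (addr / PAGE_SIZE) != ((addr + bytes - 1) / PAGE_SIZE);
}
```

With the page table entry at offset 0xffc and bit offset 1, a 4-byte operand stays inside the page, whereas an 8-byte operand covers bytes 0xffc through 0x1003 and spills into the next page, which is the case the message describes.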
2007-05-03 | KVM: Remove debug message | Avi Kivity
No longer interesting. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Use list_move() | Avi Kivity
Use list_move() where possible. Noticed by Dor Laor. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Remove unused function | Michal Piotrowski
Remove unused function:
  CC drivers/kvm/svm.o
  drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: SVM: Ensure timestamp counter monotonicity | Avi Kivity
When a vcpu is migrated from one cpu to another, its timestamp counter may lose its monotonic property if the host has unsynced timestamp counters. This can confuse the guest, sometimes to the point of refusing to boot. As the rdtsc instruction is rather fast on AMD processors (7-10 cycles), we can simply record the last host tsc when we drop the cpu, and adjust the vcpu tsc offset when we detect that we've migrated to a different cpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
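The record-and-adjust scheme can be modelled in a few lines. This is a sketch under assumptions, not the svm code: the host tsc is passed in as a parameter instead of being read with rdtsc, and `struct vcpu_tsc`, `vcpu_put`, `vcpu_load` and `guest_tsc` are hypothetical names. All arithmetic is unsigned and wraps modulo 2^64, so a negative adjustment is represented by wraparound.

```c
#include <stdint.h>

struct vcpu_tsc {
    int      cpu;         /* host cpu the vcpu last ran on, -1 if none */
    uint64_t host_tsc;    /* host tsc recorded when the cpu was dropped */
    uint64_t tsc_offset;  /* guest tsc = host tsc + tsc_offset (mod 2^64) */
};

/* Called when the vcpu leaves a host cpu: remember that cpu's tsc. */
void vcpu_put(struct vcpu_tsc *v, uint64_t host_tsc_now)
{
    v->host_tsc = host_tsc_now;
}

/* Called when the vcpu is scheduled onto a host cpu. */
void vcpu_load(struct vcpu_tsc *v, int cpu, uint64_t host_tsc_now)
{
    if (v->cpu != -1 && cpu != v->cpu) {
        /* Migrated: make the guest resume from the value it last saw. */
        v->tsc_offset += v->host_tsc - host_tsc_now;
    }
    v->cpu = cpu;
}

uint64_t guest_tsc(const struct vcpu_tsc *v, uint64_t host_tsc_now)
{
    return host_tsc_now + v->tsc_offset;
}
```

If the vcpu is dropped on cpu 0 at host tsc 5000 and loaded on cpu 1 whose counter reads 200, the offset grows by 4800 and the guest continues from 5000 instead of jumping backwards.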
2007-05-03 | KVM: MMU: Fix hugepage pdes mapping same physical address with different access | Avi Kivity
The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map the same physical address, they share the same shadow page. This is a fairly common case (kernel mappings on i386 nonpae Linux, for example). However, if the two pdes map the same memory but with different permissions, kvm will happily use the cached shadow page. If an access through the more permissive pde occurs after an access through the stricter pde, an endless pagefault loop is generated and the guest makes no progress. Fix by making the access permissions part of the cache lookup key. The fix allows Xen pae to boot on kvm and run guest domains. Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix. Signed-off-by: Avi Kivity <avi@qumranet.com>
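Making the permissions part of the lookup key can be illustrated with a toy cache. The layout below is an assumption made for the example: `struct shadow_key`, `struct shadow_page`, `ACC_WRITE` and `shadow_lookup` are hypothetical, and kvm's real cache is organised differently.

```c
#include <stddef.h>
#include <stdint.h>

#define ACC_WRITE 0x1u  /* hypothetical access bit: mapping is writable */

struct shadow_key {
    uint64_t gfn;     /* guest frame the hugepage pde maps */
    unsigned level;   /* paging level of the shadow page */
    unsigned access;  /* access permissions granted by the pde */
};

struct shadow_page {
    struct shadow_key key;
    int in_use;
};

/*
 * Find a cached shadow page. Two pdes that map the same frame with
 * different permissions produce different keys, so they no longer
 * share one shadow page.
 */
struct shadow_page *shadow_lookup(struct shadow_page *cache, size_t n,
                                  struct shadow_key key)
{
    for (size_t i = 0; i < n; i++) {
        struct shadow_page *sp = &cache[i];

        if (sp->in_use &&
            sp->key.gfn == key.gfn &&
            sp->key.level == key.level &&
            sp->key.access == key.access)  /* permissions must match too */
            return sp;
    }
    return NULL;
}
```

A lookup for a writable mapping of a frame that is cached only as read-only misses, so a separate shadow page is built for it.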
2007-05-03 | KVM: SVM: forbid guest to execute monitor/mwait | Joerg Roedel
This patch forbids the guest to execute monitor/mwait instructions on SVM. This is necessary because the guest can execute these instructions if they are available even if the kvm cpuid doesn't report their existence. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Handle writes to MCG_STATUS msr | Sergey Kiselev
Some older (~2.6.7) kernels write the MCG_STATUS register during kernel boot (mce_clear_all() function, called from mce_init()). This is not currently handled by kvm and causes it to inject a GPF. The following patch adds a "nop" handler for this. Signed-off-by: Sergey Kiselev <sergey.kiselev@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Remove unused and write-only variables | Avi Kivity
Trivial cleanup. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Don't allow the guest to turn off the cpu cache | Avi Kivity
The cpu cache is a host resource; the guest should not be able to turn it off (even for itself). Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Hack real-mode segments on vmx from KVM_SET_SREGS | Avi Kivity
As usual, we need to mangle segment registers when emulating real mode as vm86 has specific constraints. We special case the reset segment base, and set the "access rights" (or descriptor flags) to vm86 compatible values. This fixes reboot on vmx. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Modify guest segments after potentially switching modes | Avi Kivity
The SET_SREGS ioctl modifies both cr0.pe (real mode/protected mode) and guest segment registers. Since segment handling is modified by the mode on Intel processors, update the segment registers after the mode switch has taken place. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Remove set_cr0_no_modeswitch() arch op | Avi Kivity
set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers. As we now cache the protected mode values on entry to real mode, this isn't an issue anymore, and it interferes with reboot (which usually _is_ a modeswitch). Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Workaround vmx inability to virtualize the reset state | Avi Kivity
The reset state has cs.selector == 0xf000 and cs.base == 0xffff0000, which aren't compatible with vm86 mode, which is used for real mode virtualization. When we create a vcpu, we set cs.base to 0xf0000, but if we get there by way of a reset, the values are inconsistent and vmx refuses to enter guest mode. Workaround by detecting the state and munging it appropriately. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: MMU: Remove global pte tracking | Avi Kivity
The initial, noncaching, version of the kvm mmu flushed all nonglobal shadow page table translations (much like a native tlb flush). The new implementation flushes translations only when they change, rendering global pte tracking superfluous. This removes the unused tracking mechanism and storage space. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: MMU: Remove unnecessary check for pdptr access | Avi Kivity
We already special case the pdptr access, so no need to check it again. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Avoid guest virtual addresses in string pio userspace interface | Avi Kivity
The current string pio interface communicates using guest virtual addresses, relying on userspace to translate addresses and to check permissions. This interface cannot fully support guest smp, as the check needs to take into account two pages at once in case an unaligned string transfer straddles a page boundary. Change the interface not to communicate guest addresses at all; instead use a buffer page (mmaped by userspace) and do transfers there. The kernel manages the virtual to physical translation and can perform the checks atomically by taking the appropriate locks. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Future-proof argument-less ioctls | Avi Kivity
Some ioctls ignore their arguments. By requiring them to be zero now, we allow a nonzero value to have some special meaning in the future. Signed-off-by: Avi Kivity <avi@qumranet.com>
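In a handler, the requirement amounts to rejecting any nonzero argument before doing the work. The sketch below is hypothetical: `argless_ioctl` is an invented name, and returning -EINVAL is an assumption about the error code, not something the message states.

```c
#include <errno.h>

/*
 * Handler for an ioctl that takes no argument. Nonzero values are
 * refused today so that they can be given a meaning later without
 * breaking callers that happened to pass garbage.
 */
long argless_ioctl(unsigned long arg)
{
    if (arg != 0)
        return -EINVAL;  /* assumed error code for a reserved argument */

    /* ... perform the operation ... */
    return 0;
}
```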
2007-05-03 | KVM: Allow kernel to select size of mmap() buffer | Avi Kivity
This allows us to store offsets in the kernel/user kvm_run area, and be sure that userspace has them mapped. As offsets can be outside the kvm_run struct, userspace has no way of knowing how much to mmap. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Add guest mode signal mask | Avi Kivity
Allow a special signal mask to be used while executing in guest mode. This allows signals to be used to interrupt a vcpu without requiring signal delivery to a userspace handler, which is quite expensive. Userspace still receives -EINTR and can get the signal via sigwait(). Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Initialize the apic_base msr on svm too | Avi Kivity
Older userspace didn't care, but newer userspace (with the cpuid changes) does. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Add a special exit reason when exiting due to an interrupt | Avi Kivity
This is redundant, as we also return -EINTR from the ioctl, but it allows us to examine the exit_reason field on resume without seeing old data. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Fold kvm_run::exit_type into kvm_run::exit_reason | Avi Kivity
Currently, userspace is told about the nature of the last exit from the guest using two fields, exit_type and exit_reason, where exit_type has just two enumerations (and no need for more). So fold exit_type into exit_reason, reducing the complexity of determining what really happened. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Allow userspace to process hypercalls which have no kernel handler | Avi Kivity
This is useful for paravirtualized graphics devices, for example. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Add method to check for backwards-compatible API extensions | Avi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Remove the 'emulated' field from the userspace interface | Avi Kivity
We no longer emulate single instructions in userspace. Instead, we service mmio or pio requests. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Handle cpuid in the kernel instead of punting to userspace | Avi Kivity
KVM used to handle cpuid by letting userspace decide what values to return to the guest. We now handle cpuid completely in the kernel. We still let userspace decide which values the guest will see by having userspace set up the value table beforehand (this is necessary to allow management software to set the cpu features to the least common denominator, so that live migration can work). The motivation for the change is that kvm kernel code can be impacted by cpuid features, for example the x86 emulator. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Do not communicate to userspace through cpu registers during PIO | Avi Kivity
Currently when passing a PIO emulation request to userspace, we rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx (on string instructions). This (a) requires two extra ioctls for getting and setting the registers and (b) is unfriendly to non-x86 archs, when they get kvm ports. So fix by doing the register fixups in the kernel and passing to userspace only an abstract description of the PIO to be done. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Use a shared page for kernel/user communication when running a vcpu | Avi Kivity
Instead of passing a 'struct kvm_run' back and forth between the kernel and userspace, allocate a page and allow the user to mmap() it. This reduces needless copying and makes the interface expandable by providing lots of free space. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Fix bogus sign extension in mmu mapping audit | Avi Kivity
When auditing a 32-bit guest on a 64-bit host, sign extension of the page table directory pointer table index caused bogus addresses to be shown on audit errors. Fix by declaring the index unsigned. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Use own minor number | Avi Kivity
Use the minor number (232) allocated to kvm by lanana. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Use the generic skip_emulated_instruction() in hypercall code | Dor Laor
Instead of twiddling the rip registers directly, use the skip_emulated_instruction() function to do that for us. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 | KVM: Fix guest register corruption on paravirt hypercall | Dor Laor
The hypercall code mixes up the ->cache_regs() and ->decache_regs() callbacks, resulting in guest register corruption. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-04-30 | libata: honour host controllers that want just one host | Linus Torvalds
The Marvell IDE interface on my machine would hit a BUG_ON() in lib/iomem.c because it was calling ata_pci_init_one() specifying just a single port on the host, but that would actually end up trying to initialize two ports, the second one with bogus information. This fixes "ata_pci_init_one()" so that it actually passes down the n_ports variable that it got from the low-level driver to the host allocation routine ("ata_host_alloc_pinfo()"), which results in the ATA layer actually having the correct port number information. And in order to make it all work, I also needed to fix a few places that had incorrectly hard-coded the fact that a host always had exactly two ports (both ata_pci_init_bmdma() and ata_request_legacy_irqs() would just always iterate over both ports). Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-30 | power management: implement pm_ops.valid for everybody | Johannes Berg
Almost all users of pm_ops only support mem sleep, don't check in .valid and don't reject any others in .prepare so users can be confused if they check /sys/power/state, especially when new states are added (these would then result in s-t-r although they're supposed to be something different). This patch implements a generic pm_valid_only_mem function that is then exported for users and puts it to use in almost all existing pm_ops. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: David Brownell <david-b@pacbell.net> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: linux-pm@lists.linux-foundation.org Cc: Len Brown <lenb@kernel.org> Acked-by: Russell King <rmk@arm.linux.org.uk> Cc: Greg KH <greg@kroah.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
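A generic validity callback of this kind is essentially a one-line predicate. In the sketch below only the name `pm_valid_only_mem` comes from the message; the `enum suspend_state` values are a stand-in for the kernel's own state type, and the real signature may differ.

```c
/* Stand-in for the kernel's suspend state type (assumed values). */
enum suspend_state {
    PM_SUSPEND_ON,
    PM_SUSPEND_STANDBY,
    PM_SUSPEND_MEM,
    PM_SUSPEND_DISK
};

/* Accept suspend-to-RAM only; every other state is reported invalid. */
int pm_valid_only_mem(enum suspend_state state)
{
    return state == PM_SUSPEND_MEM;
}
```

A platform that supports only mem sleep would point its `.valid` hook at this function, so that states it cannot enter are rejected up front.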
2007-04-30 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (56 commits)
  ieee1394: remove garbage from Kconfig
  ieee1394: more help in Kconfig
  ieee1394: ohci1394: Fix mistake in printk message.
  ieee1394: ohci1394: remove unnecessary rcvPhyPkt bit flipping in LinkControl register
  ieee1394: ohci1394: fix cosmetic problem in error logging
  ieee1394: eth1394: send async streams at S100 on 1394b buses
  ieee1394: eth1394: fix error path in module_init
  ieee1394: eth1394: correct return codes in hard_start_xmit
  ieee1394: eth1394: hard_start_xmit is called in atomic context
  ieee1394: eth1394: some conditions are unlikely
  ieee1394: eth1394: clean up fragment_overlap
  ieee1394: eth1394: don't use alloc_etherdev
  ieee1394: eth1394: omit useless set_mac_address callback
  ieee1394: eth1394: CONFIG_INET is always defined
  ieee1394: eth1394: allow MTU bigger than 1500
  ieee1394: unexport highlevel_host_reset
  ieee1394: eth1394: contain host reset
  ieee1394: eth1394: shorter error messages
  ieee1394: eth1394: correct a memset argument
  ieee1394: eth1394: refactor .probe and .update
  ...
2007-04-30 | Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid | Linus Torvalds
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid: (21 commits)
  USB HID: don't warn on idVendor == 0
  USB HID: add 'quirks' module parameter
  USB HID: add support for dynamically-created quirks
  USB HID: clarify static quirk handling as squirks
  USB HID: encapsulate quirk handling into hid-quirks.c
  USB HID: EMS USBII device needs HID_QUIRK_MULTI_INPUT
  HID: update copyright and authorship macro
  HID: introduce proper zeroing of unused bits in output reports
  USB HID: add support for WiseGroup MP-8800 Quad Joypad
  USB HID: add FF support for Logitech Force 3D Pro Joystick
  USB HID: numlock quirk for dell W7658 keyboard
  USB HID: Logitech MX3000 keyboard needs report descriptor quirk
  USB HID: extend quirk for Logitech S510 keyboard
  USB HID: usbkbd/usbmouse - handle errors when registering devices
  USB HID: add QUIRK_HIDDEV for Belkin Flip KVM
  HID: enable dead keys on a belkin wireless keyboard
  USB HID: Thustmaster firestorm dual power v1 support
  USB HID: specify explicit size for hid_blacklist.quirks
  USB HID: fix retry & reset logic
  USB HID: consolidate vendor/product ids
  ...
2007-04-30 | Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 | Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [IPV4] SNMP: Support OutMcastPkts and OutBcastPkts
  [IPV4] SNMP: Support InMcastPkts and InBcastPkts
  [IPV4] SNMP: Support InTruncatedPkts
  [IPV4] SNMP: Support InNoRoutes
  [SNMP]: Add definitions for {In,Out}BcastPkts
  [TCP] FRTO: RFC4138 allows Nagle override when new data must be sent
  [TCP] FRTO: Delay skb available check until it's mandatory
  [XFRM]: Restrict upper layer information by bundle.
  [TCP]: Catch skb with S+L bugs earlier
  [PATCH] INET : IPV4 UDP lookups converted to a 2 pass algo
  [L2TP]: Add the ability to autoload a pppox protocol module.
  [SKB]: Introduce skb_queue_walk_safe()
  [AF_IUCV/IUCV]: smp_call_function deadlock
  [IPV6]: Fix slab corruption running ip6sic
  [TCP]: Update references in two old comments
  [XFRM]: Export SPD info
  [IPV6]: Track device renames in snmp6.
  [SCTP]: Fix sctp_getsockopt_local_addrs_old() to use local storage.
  [NET]: Remove NETIF_F_INTERNAL_STATS, default to internal stats.
  [NETPOLL]: Remove CONFIG_NETPOLL_RX
  ...
2007-04-30 | Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block | Linus Torvalds
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  [PATCH] elevator: elv_list_lock does not need irq disabling
  [BLOCK] Don't pin lots of memory in mempools
  cfq-iosched: speedup cic rb lookup
  ll_rw_blk: add io_context private pointer
  cfq-iosched: get rid of cfqq hash
  cfq-iosched: tighten queue request overlap condition
  cfq-iosched: improve sync vs async workloads
  cfq-iosched: never allow an async queue idling
  cfq-iosched: get rid of ->dispatch_slice
  cfq-iosched: don't pass unused preemption variable around
  cfq-iosched: get rid of ->cur_rr and ->cfq_list
  cfq-iosched: slice offset should take ioprio into account
  [PATCH] cfq-iosched: style cleanups and comments
  cfq-iosched: sort IDLE queues into the rbtree
  cfq-iosched: sort RT queues into the rbtree
  [PATCH] cfq-iosched: speed up rbtree handling
  cfq-iosched: rework the whole round-robin list concept
  cfq-iosched: minor updates
  cfq-iosched: development update
  cfq-iosched: improve preemption for cooperating tasks
2007-04-30 | Merge branch 'for-2.6.22' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc | Linus Torvalds
* 'for-2.6.22' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (255 commits)
  [POWERPC] Remove dev_dbg redefinition in drivers/ps3/vuart.c
  [POWERPC] remove kernel module option for booke wdt
  [POWERPC] Avoid putting cpu node twice
  [POWERPC] Spinlock initializer cleanup
  [POWERPC] ppc4xx_sgdma needs dma-mapping.h
  [POWERPC] arch/powerpc/sysdev/timer.c build fix
  [POWERPC] get_property cleanups
  [POWERPC] Remove the unused HTDMSOUND driver
  [POWERPC] cell: cbe_cpufreq cleanup and crash fix
  [POWERPC] Declare enable_kernel_spe in a header
  [POWERPC] Add dt_xlate_addr() to bootwrapper
  [POWERPC] bootwrapper: CONFIG_ -> CONFIG_DEVICE_TREE
  [POWERPC] Don't define a custom bd_t for Xilixn Virtex based boards.
  [POWERPC] Add sane defaults for Xilinx EDK generated xparameters files
  [POWERPC] Add uartlite boot console driver for the zImage wrapper
  [POWERPC] Stop using ppc_sys for Xilinx Virtex boards
  [POWERPC] New registration for common Xilinx Virtex ppc405 platform devices
  [POWERPC] Merge common virtex header files
  [POWERPC] Rework Kconfig dependancies for Xilinx Virtex ppc405 platform
  [POWERPC] Clean up cpufreq Kconfig dependencies
  ...