aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-10-15KVM: VMX: Clean up magic number 0x66 in init_rmode_tssSheng Yang
Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Reduce stack usage in kvm_pv_mmu_op()Dave Hansen
We're in a hot path. We can't use kmalloc() because it might impact performance. So, we just stick the buffer that we need into the kvm_vcpu_arch structure. This is used very often, so it is not really a waste. We also have to move the buffer structure's definition to the arch-specific x86 kvm header. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Reduce stack usage in kvm_arch_vcpu_ioctl()Dave Hansen
[sheng: fix KVM_GET_LAPIC using wrong size] Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Reduce stack usage in kvm_vcpu_ioctl()Dave Hansen
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Reduce kvm stack usage in kvm_arch_vm_ioctl()Dave Hansen
On my machine with gcc 3.4, kvm uses ~2k of stack in a few select functions. This is mostly because gcc fails to notice that the different case: statements could have their stack usage combined. It overflows very nicely if interrupts happen during one of these large uses. This patch uses two methods for reducing stack usage. 1. dynamically allocate large objects instead of putting on the stack. 2. Use a union{} member for all of the case variables. This tricks gcc into combining them all into a single stack allocation. (There's also a comment on this) Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: pci device assignmentBen-Ami Yassour
Based on a patch from: Amit Shah <amit.shah@qumranet.com> This patch adds support for handling PCI devices that are assigned to the guest. The device to be assigned to the guest is registered in the host kernel and interrupt delivery is handled. If a device is already assigned, or the device driver for it is still loaded on the host, the device assignment is failed by conveying a -EBUSY reply to the userspace. Devices that share their interrupt line are not supported at the moment. By itself, this patch will not make devices work within the guest. The VT-d extension is required to enable the device to perform DMA. Another alternative is PVDMA. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: direct mmio pfn checkBen-Ami Yassour
Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15x86: KVM guest: use paravirt function to calculate cpu khzGlauber Costa
We're currently facing timing problems in guests that do calibration under heavy load, and then the load vanishes. This means we'll have a much lower lpj than we actually should, and delays end up taking less time than they should, which is a nasty bug. Solution is to pass on the lpj value from host to guest, and have it preset. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15x86: paravirt: factor out cpu_khz to common codeGlauber Costa
KVM intends to use paravirt code to calibrate khz. Xen current code will do just fine. So as a first step, factor out code to pvclock.c. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: PIT: fix injection logic and countMarcelo Tosatti
The PIT injection logic is problematic under the following cases: 1) If there is a higher priority vector to be delivered by the time kvm_pit_timer_intr_post is invoked ps->inject_pending won't be set. This opens the possibility for missing many PIT event injections (say if guest executes hlt at this point). 2) ps->inject_pending is racy with more than two vcpus. Since there's no locking around read/dec of pt->pending, two vcpu's can inject two interrupts for a single pt->pending count. Fix 1 by using an irq ack notifier: only reinject when the previous irq has been acked. Fix 2 with appropriate locking around manipulation of pending count and irq_ack by the injection / ack paths. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: irq ack notificationMarcelo Tosatti
Based on a patch from: Ben-Ami Yassour <benami@il.ibm.com> which was based on a patch from: Amit Shah <amit.shah@qumranet.com> Notify IRQ acking on PIC/APIC emulation. The previous patch missed two things: - Edge triggered interrupts on IOAPIC - PIC reset with IRR/ISR set should be equivalent to ack (LAPIC probably needs something similar). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> CC: Amit Shah <amit.shah@qumranet.com> CC: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Add irq ack notifier listAvi Kivity
This can be used by kvm subsystems that are interested in when interrupts are acked, for example time drift compensation. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: powerpc: Map guest userspace with TID=0 mappingsHollis Blanchard
When we use TID=N userspace mappings, we must ensure that kernel mappings have been destroyed when entering userspace. Using TID=1/TID=0 for kernel/user mappings and running userspace with PID=0 means that userspace can't access the kernel mappings, but the kernel can directly access userspace. The net is that we don't need to flush the TLB on privilege switches, but we do on guest context switches (which are far more infrequent). Guest boot time performance improvement: about 30%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: ppc: Write only modified shadow entries into the TLB on exitHollis Blanchard
Track which TLB entries need to be written, instead of overwriting everything below the high water mark. Typically only a single guest TLB entry will be modified in a single exit. Guest boot time performance improvement: about 15%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: ppc: Stop saving host TLB stateHollis Blanchard
We're saving the host TLB state to memory on every exit, but never using it. Originally I had thought that we'd want to restore host TLB for heavyweight exits, but that could actually hurt when context switching to an unrelated host process (i.e. not qemu). Since this decreases the performance penalty of all exits, this patch improves guest boot time by about 15%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: ppc: guest breakpoint supportHollis Blanchard
Allow host userspace to program hardware debug registers to set breakpoints inside guests. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Ignore DEBUGCTL MSRs with no effectAlexander Graf
Netware writes to DEBUGCTL and reads from the DEBUGCTL and LAST*IP MSRs without further checks and is really confused to receive a #GP during that. To make it happy we should just make them stubs, which is exactly what SVM already does. Writes to DEBUGCTL that are vendor-specific are resembled to behave as if the virtual CPU does not know them. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Avoid vmwrite(HOST_RSP) when possibleAvi Kivity
Usually HOST_RSP retains its value across guest entries. Take advantage of this and avoid a vmwrite() when this is so. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: ppc: trace powerpc instruction emulationChristian Ehrhardt
This patch adds a trace point for the instruction emulation on embedded powerpc utilizing the KVM_TRACE interface. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: ppc: adds trace points for ppc tlb activityJerone Young
This patch adds trace points to track powerpc TLB activities using the KVM_TRACE infrastructure. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: ppc: enable KVM_TRACE building for powerpcJerone Young
This patch enables KVM_TRACE to build for PowerPC arch. This means just adding sections to Kconfig and Makefile. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: kvmtrace: replace get_cycles with ktime_get v3Christian Ehrhardt
The current kvmtrace code uses get_cycles() while the interpretation would be easier using using nanoseconds. ktime_get() should give at least the same accuracy as get_cycles on all architectures (even better on 32bit archs) but at a better unit (e.g. comparable between hosts with different frequencies. [avi: avoid ktime_t in public header] Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: kvmtrace: Remove use of bit fields in kvm trace structureChristian Ehrhardt
This patch fixes kvmtrace use on big endian systems. When using bit fields the compiler will lay data out in the wrong order expected when laid down into a file. This fixes it by using one variable instead of using bit fields. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: SVM: Unify register save/restore across 32 and 64 bit hostsAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Unify register save/restore across 32 and 64 bit hostsAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Reinject real mode exceptionJan Kiszka
As we execute real mode guests in VM86 mode, exception have to be reinjected appropriately when the guest triggered them. For this purpose the patch adopts the real-mode injection pattern used in vmx_inject_irq to vmx_queue_exception, additionally taking care that the IP is set correctly for #BP exceptions. Furthermore it extends handle_rmode_exception to reinject all those exceptions that can be raised in real mode. This fixes the execution of himem.exe from FreeDOS and also makes its debug.com work properly. Note that guest debugging in real mode is broken now. This has to be fixed by the scheduled debugging infrastructure rework (will be done once base patches for QEMU have been accepted). Signed-off-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Consolidate XX_VECTOR definesJan Kiszka
Signed-off-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Consolidate PIC isr clearing into a functionAvi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Remove redundant check in handle_rmode_exceptionMohammed Gamal
Since checking for vcpu->arch.rmode.active is already done whenever we call handle_rmode_exception(), checking it inside the function is redundant. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Move interrupt post-processing to vmx_complete_interrupts()Avi Kivity
Instead of looking at failed injections in the vm entry path, move processing to the exit path in vmx_complete_interrupts(). This simplifes the logic and removes any state that is hidden in vmx registers. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Add a pending interrupt queueAvi Kivity
Similar to the exception queue, this hold interrupts that have been accepted by the virtual processor core but not yet injected. Not yet used. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Fix pending exception processingAvi Kivity
The vmx code assumes that IDT-Vectoring can only be set when an exception is injected due to the exception in question. That's not true, however: if the exception is injected correctly, and later another exception occurs but its delivery is blocked due to a fault, then we will incorrectly assume the first exception was not delivered. Fix by unconditionally dequeuing the pending exception, and requeuing it (or the second exception) if we see it in the IDT-Vectoring field. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Clear exception queue before emulating an instructionAvi Kivity
If we're emulating an instruction, either it will succeed, in which case any previously queued exception will be spurious, or we will requeue the same exception. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Move nmi injection failure processing to vm exit pathAvi Kivity
Instead of processing nmi injection failure in the vm entry path, move it to the vm exit path (vm_complete_interrupts()). This separates nmi injection from nmi post-processing, and moves the nmi state from the VT state into vcpu state (new variable nmi_injected specifying an injection in progress). Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Move NMI IRET fault processing to new vmx_complete_interrupts()Avi Kivity
Currently most interrupt exit processing is handled on the entry path, which is confusing. Move the NMI IRET fault processing to a new function, vmx_complete_interrupts(), which is called on the vmexit path. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Simplify kvm_mmu_zap_page()Avi Kivity
The twisty maze of conditionals can be reduced. [joerg: fix tlb flushing] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: MMU: Separate the code for unlinking a shadow page from its parentsAvi Kivity
Place into own function, in preparation for further cleanups. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Introduce kvm_set_irq to inject interrupts in guestsAmit Shah
This function injects an interrupt into the guest given the kvm struct, the (guest) irq number and the interrupt level. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: Move KVM TRACE DEFINITIONS to common headerHollis Blanchard
Move KVM trace definitions from x86 specific kvm headers to common kvm headers to create a cross-architecture numbering scheme for trace events. This means the kvmtrace_format userspace tool won't need to know which architecture produced the log file being processed. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: x86: accessors for guest registersMarcelo Tosatti
As suggested by Avi, introduce accessors to read/write guest registers. This simplifies the ->cache_regs/->decache_regs interface, and improves register caching which is important for VMX, where the cost of vmcs_read/vmcs_write is significant. [avi: fix warnings] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15KVM: VMX: Rename misnamed msr bitsSheng Yang
MSR_IA32_FEATURE_LOCKED is just a bit in fact, which shouldn't be prefixed with MSR_. So is MSR_IA32_FEATURE_VMXON_ENABLED. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-14Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6Linus Torvalds
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: i2c-viapro: Add support for SMBus Process Call transactions i2c: Restore i2c_smbus_process_call function i2c: Do earlier driver model init i2c: Only build Tyan SMBus mux drivers on x86 i2c: Guard against oopses from bad init sequences i2c: Document the implementation details of the /dev interface i2c: Improve dev-interface documentation i2c-parport-light: Don't register a platform device resource hwmon: (dme1737) Convert to a new-style i2c driver hwmon: (dme1737) Be less i2c-centric i2c/tps65010: Vibrator hookup to gpiolib i2c-viapro: Add VX800/VX820 support i2c: Renesas Highlander FPGA SMBus support i2c-pca-isa: Don't grab arbitrary resources i2c/isp1301_omap: Convert to a new-style i2c driver, part 2 i2c/isp1301_omap: Convert to a new-style i2c driver, part 1
2008-10-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (55 commits) HID: build drivers for all quirky devices by default HID: add missing blacklist entry for Apple ATV ircontrol HID: add support for Bright ABNT2 brazilian device HID: Don't let Avermedia Radio FM800 be handled by usb hid drivers HID: fix numlock led on Dell device 0x413c/0x2105 HID: remove warn() macro from usb hid drivers HID: remove info() macro from usb HID drivers HID: add appletv IR receiver quirk HID: fix a lockup regression when using force feedback on a PID device HID: hiddev.h: Fix example code. HID: hiddev.h: Fix mixed space and tabs in example code. HID: convert to dev_* prints HID: remove hid-ff HID: move zeroplus FF processing HID: move thrustmaster FF processing HID: move pantherlord FF processing HID: fix incorrent length condition in hidraw_write() HID: fix tty<->hid deadlock HID: ignore iBuddy devices HID: report descriptor fix for remaining MacBook JIS keyboards ...
2008-10-14Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (56 commits) ocfs2: Make cached block reads the common case. ocfs2: Kill the last naked wait_on_buffer() for cached reads. ocfs2: Move ocfs2_bread() into dir.c ocfs2: Simplify ocfs2_read_block() ocfs2: Require an inode for ocfs2_read_block(s)(). ocfs2: Separate out sync reads from ocfs2_read_blocks() ocfs2: Refactor xattr list and remove ocfs2_xattr_handler(). ocfs2: Calculate EA hash only by its suffix. ocfs2: Move trusted and user attribute support into xattr.c ocfs2: Uninline ocfs2_xattr_name_hash() ocfs2: Don't check for NULL before brelse() ocfs2: use smaller counters in ocfs2_remove_xattr_clusters_from_cache ocfs2: Documentation update for user_xattr / nouser_xattr mount options ocfs2: make la_debug_mutex static ocfs2: Remove pointless !! ocfs2: Add empty bucket support in xattr. ocfs2/xattr.c: Fix a bug when inserting xattr. ocfs2: Add xattr mount option in ocfs2_show_options() ocfs2: Switch over to JBD2. ocfs2: Add the 'inode64' mount option. ...
2008-10-14rtc-cmos: look for PNP RTC first, then for platform RTCBjorn Helgaas
We shouldn't rely on "pnp_platform_devices" to tell us whether there is a PNP RTC device. I introduced "pnp_platform_devices", but I think it was a mistake. All it tells us is whether we found any PNPBIOS or PNPACPI devices. Many machines have some PNP devices, but do not describe the RTC via PNP. On those machines, we need to do the platform driver probe to find the RTC. We should just register the PNP driver and see whether it claims anything. If we don't find a PNP RTC, fall back to the platform driver probe. This (in conjunction with the arch/x86/kernel/rtc.c patch to add a platform RTC device when PNP doesn't have one) should resolve these issues: http://bugzilla.kernel.org/show_bug.cgi?id=11580 https://bugzilla.redhat.com/show_bug.cgi?id=451188 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Reported-by: Rik Theys <rik.theys@esat.kuleuven.be> Reported-by: shr_msn@yahoo.com.tw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-14x86: register a platform RTC device if PNP doesn't describe itBjorn Helgaas
Most if not all x86 platforms have an RTC device, but sometimes the RTC is not exposed as a PNP0b00/PNP0b01/PNP0b02 device in PNPBIOS or ACPI: http://bugzilla.kernel.org/show_bug.cgi?id=11580 https://bugzilla.redhat.com/show_bug.cgi?id=451188 It's best if we can discover the RTC via PNP because then we know which flavor of device it is, where it lives, and which IRQ it uses. But if we can't, we should register a platform device using the compiled-in RTC_PORT/RTC_IRQ resource assumptions. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Reported-by: Rik Theys <rik.theys@esat.kuleuven.be> Reported-by: shr_msn@yahoo.com.tw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-14rtc-cmos: move wake setup from ACPI glue into RTC driverBjorn Helgaas
Move rtc_wake_setup() from drivers/acpi/glue.c into the RTC driver in drivers/rtc/rtc-cmos.c. This removes the ordering constraint between the module_init(acpi_rtc_init) and the cmos_do_probe() code that depends on it. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-14HID: build drivers for all quirky devices by defaultJiri Kosina
Once kernel configuration has CONFIG_HID turned on, let also all the specialized drivers for quirky devices to be built (unless CONFIG_EMBEDDED is specified), as usually users don't care that much which driver gives them the functionality, but when they want generic support, they probably want to have support for all the quirky devices as well. Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14HID: add missing blacklist entry for Apple ATV ircontrolJiri Kosina
This device is already handled by hid-apple driver, but the blacklist entry was missing in generic driver. Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14HID: add support for Bright ABNT2 brazilian deviceMauro Carvalho Chehab
This keyboard needs to reset the LEDS during probe. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>