aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/include/asm
AgeCommit message (Collapse)Author
2009-09-10KVM: VMX: Introduce KVM_SET_IDENTITY_MAP_ADDR ioctlSheng Yang
Now KVM allow guest to modify guest's physical address of EPT's identity mapping page. (change from v1, discard unnecessary check, change ioctl to accept parameter address rather than value) Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-10KVM: Reduce runnability interface with arch support codeGleb Natapov
Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from interface between general code and arch code. kvm_arch_vcpu_runnable() checks for interrupts instead. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Move kvm_cpu_get_interrupt() declaration to x86 codeGleb Natapov
It is implemented only by x86. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: PIT support for HPET legacy modeBeth Kon
When kvm is in hpet_legacy_mode, the hpet is providing the timer interrupt and the pit should not be. So in legacy mode, the pit timer is destroyed, but the *state* of the pit is maintained. So if kvm or the guest tries to modify the state of the pit, this modification is accepted, *except* that the timer isn't actually started. When we exit hpet_legacy_mode, the current state of the pit (which is up to date since we've been accepting modifications) is used to restart the pit timer. The saved_mode code in kvm_pit_load_count temporarily changes mode to 0xff in order to destroy the timer, but then restores the actual value, again maintaining "current" state of the pit for possible later reenablement. [avi: add some reserved storage in the ioctl; make SET_PIT2 IOW] [marcelo: fix memory corruption due to reserved storage] Signed-off-by: Beth Kon <eak@us.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Add Directed EOI support to APIC emulationGleb Natapov
Directed EOI is specified by x2APIC, but is available even when lapic is in xAPIC mode. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Prepare memslot data structures for multiple hugepage sizesJoerg Roedel
[avi: fix build on non-x86] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: convert custom marker based tracing to event tracesMarcelo Tosatti
This allows use of the powerful ftrace infrastructure. See Documentation/trace/ for usage information. [avi, stephen: various build fixes] [sheng: fix control register breakage] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10x86: Add definition for IGNNE MSRAlexander Graf
Hyper-V accesses MSR_IGNNE while running under KVM. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: more MSR_IA32_VMX_EPT_VPID_CAP capability bitsMarcelo Tosatti
Required for EPT misconfiguration handler. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Move rmode structure to vmx-specific codeAvi Kivity
rmode is only used in vmx, so move it to vmx.c Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: VMX: Support Unrestricted Guest featureNitin A Kamble
"Unrestricted Guest" feature is added in the VMX specification. Intel Westmere and onwards processors will support this feature. It allows kvm guests to run real mode and unpaged mode code natively in the VMX mode when EPT is turned on. With the unrestricted guest there is no need to emulate the guest real mode code in the vm86 container or in the emulator. Also the guest big real mode code works like native. The attached patch enhances KVM to use the unrestricted guest feature if available on the processor. It also adds a new kernel/module parameter to disable the unrestricted guest feature at the boot time. Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Cache pdptrsAvi Kivity
Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx, guest memory access on svm) extract them on demand. Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Add MCE supportHuang Ying
The related MSRs are emulated. MCE capability is exported via extension KVM_CAP_MCE and ioctl KVM_X86_GET_MCE_CAP_SUPPORTED. A new vcpu ioctl command KVM_X86_SETUP_MCE is used to setup MCE emulation such as the mcg_cap. MCE is injected via vcpu ioctl command KVM_X86_SET_MCE. Extended machine-check state (MCG_EXT_P) and CMCI are not implemented. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10KVM: Replace MSR_IA32_TIME_STAMP_COUNTER with MSR_IA32_TSC of msr-index.hJaswinder Singh Rajput
Use standard msr-index.h's MSR declaration. MSR_IA32_TSC is better than MSR_IA32_TIME_STAMP_COUNTER as it also solves 80 column issue. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-17x86, pat: Allow ISA memory range uncacheable mapping requestsSuresh Siddha
Max Vozeler reported: > Bug 13877 - bogl-term broken with CONFIG_X86_PAT=y, works with =n > > strace of bogl-term: > 814 mmap2(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0) > = -1 EAGAIN (Resource temporarily unavailable) > 814 write(2, "bogl: mmaping /dev/fb0: Resource temporarily unavailable\n", > 57) = 57 PAT code maps the ISA memory range as WB in the PAT attribute, so that fixed range MTRR registers define the actual memory type (UC/WC/WT etc). But the upper level is_new_memtype_allowed() API checks are failing, as the request here is for UC and the return tracked type is WB (Tracked type is WB as MTRR type for this legacy range potentially will be different for each 4k page). Fix is_new_memtype_allowed() by always succeeding the ISA address range checks, as the null PAT (WB) and def MTRR fixed range register settings satisfy the memory type needs of the applications that map the ISA address range. Reported-and-Tested-by: Max Vozeler <xam@debian.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-15x86: Fix UV BAU destination subnode idCliff Wickman
The SGI UV Broadcast Assist Unit is used to send TLB shootdown messages to remote nodes of the system. The header of the message must contain the subnode id of the block in the receiving hub that handles such messages. It should always be 0x10, the id of the "LB" block. It had previously been documented as a "must be zero" field. Signed-off-by: Cliff Wickman <cpw@sgi.com> Acked-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <E1Mc1x7-0005Ce-6t@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Work around compilation warning in arch/x86/kernel/apm_32.c x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq() x86, 32-bit: Fix double accounting in reserve_top_address() x86: Don't use current_cpu_data in x2apic phys_pkg_id x86, UV: Fix UV apic mode x86, UV: Fix macros for accessing large node numbers x86, UV: Delete mapping of MMR rangs mapped by BIOS x86, UV: Handle missing blade-local memory correctly x86: fix assembly constraints in native_save_fl() x86, msr: execute on the correct CPU subset x86: Fix assert syntax in vmlinux.lds.S x86: Make 64-bit efi_ioremap use ioremap on MMIO regions x86: Add quirk to make Apple MacBook5,2 use reboot=pci x86: Fix CPA memtype reserving in the set_pages_array*() cases x86, pat: Fix set_memory_wc related corruption x86: fix section mismatch for i386 init code
2009-08-04x86, UV: Fix macros for accessing large node numbersJack Steiner
The UV chipset automatically supplies the upper bits on nodes being referenced by MMR accesses. These bit can be deleted from the hub addressing macros. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20090727143808.GA8076@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04x86, UV: Handle missing blade-local memory correctlyJack Steiner
UV blades may not have any blade-local memory. Add a field (nid) to the UV blade structure to indicates whether the node has local memory. This is needed by the GRU driver (pushed separately). Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org LKML-Reference: <20090727143507.GA7006@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03x86: fix assembly constraints in native_save_fl()H. Peter Anvin
From Gabe Black in bugzilla 13888: native_save_fl is implemented as follows: 11static inline unsigned long native_save_fl(void) 12{ 13 unsigned long flags; 14 15 asm volatile("# __raw_save_flags\n\t" 16 "pushf ; pop %0" 17 : "=g" (flags) 18 : /* no input */ 19 : "memory"); 20 21 return flags; 22} If gcc chooses to put flags on the stack, for instance because this is inlined into a larger function with more register pressure, the offset of the flags variable from the stack pointer will change when the pushf is performed. gcc doesn't attempt to understand that fact, and address used for pop will still be the same. It will write to somewhere near flags on the stack but not actually into it and overwrite some other value. I saw this happen in the ide_device_add_all function when running in a simulator I work on. I'm assuming that some quirk of how the simulated hardware is set up caused the code path this is on to be executed when it normally wouldn't. A simple fix might be to change "=g" to "=r". Reported-by: Gabe Black <spamforgabe@umich.edu> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Stable Team <stable@kernel.org>
2009-08-03x86: Make 64-bit efi_ioremap use ioremap on MMIO regionsPaul Mackerras
Booting current 64-bit x86 kernels on the latest Apple MacBook (MacBook5,2) via EFI gives the following warning: [ 0.182209] ------------[ cut here ]------------ [ 0.182222] WARNING: at arch/x86/mm/pageattr.c:581 __cpa_process_fault+0x44/0xa0() [ 0.182227] Hardware name: MacBook5,2 [ 0.182231] CPA: called for zero pte. vaddr = ffff8800ffe00000 cpa->vaddr = ffff8800ffe00000 [ 0.182236] Modules linked in: [ 0.182242] Pid: 0, comm: swapper Not tainted 2.6.31-rc4 #6 [ 0.182246] Call Trace: [ 0.182254] [<ffffffff8102c754>] ? __cpa_process_fault+0x44/0xa0 [ 0.182261] [<ffffffff81048668>] warn_slowpath_common+0x78/0xd0 [ 0.182266] [<ffffffff81048744>] warn_slowpath_fmt+0x64/0x70 [ 0.182272] [<ffffffff8102c7ec>] ? update_page_count+0x3c/0x50 [ 0.182280] [<ffffffff818d25c5>] ? phys_pmd_init+0x140/0x22e [ 0.182286] [<ffffffff8102c754>] __cpa_process_fault+0x44/0xa0 [ 0.182292] [<ffffffff8102ce60>] __change_page_attr_set_clr+0x5f0/0xb40 [ 0.182301] [<ffffffff810d1035>] ? vm_unmap_aliases+0x175/0x190 [ 0.182307] [<ffffffff8102d4ae>] change_page_attr_set_clr+0xfe/0x3d0 [ 0.182314] [<ffffffff8102dcca>] _set_memory_uc+0x2a/0x30 [ 0.182319] [<ffffffff8102dd4b>] set_memory_uc+0x7b/0xb0 [ 0.182327] [<ffffffff818afe31>] efi_enter_virtual_mode+0x2ad/0x2c9 [ 0.182334] [<ffffffff818a1c66>] start_kernel+0x2db/0x3f4 [ 0.182340] [<ffffffff818a1289>] x86_64_start_reservations+0x99/0xb9 [ 0.182345] [<ffffffff818a1389>] x86_64_start_kernel+0xe0/0xf2 [ 0.182357] ---[ end trace 4eaa2a86a8e2da22 ]--- [ 0.182982] init_memory_mapping: 00000000ffffc000-0000000100000000 [ 0.182993] 00ffffc000 - 0100000000 page 4k This happens because the 64-bit version of efi_ioremap calls init_memory_mapping for all addresses, regardless of whether they are RAM or MMIO. The EFI tables on this machine ask for runtime access to some MMIO regions: [ 0.000000] EFI: mem195: type=11, attr=0x8000000000000000, range=[0x0000000093400000-0x0000000093401000) (0MB) [ 0.000000] EFI: mem196: type=11, attr=0x8000000000000000, range=[0x00000000ffc00000-0x00000000ffc40000) (0MB) [ 0.000000] EFI: mem197: type=11, attr=0x8000000000000000, range=[0x00000000ffc40000-0x00000000ffc80000) (0MB) [ 0.000000] EFI: mem198: type=11, attr=0x8000000000000000, range=[0x00000000ffc80000-0x00000000ffca4000) (0MB) [ 0.000000] EFI: mem199: type=11, attr=0x8000000000000000, range=[0x00000000ffca4000-0x00000000ffcb4000) (0MB) [ 0.000000] EFI: mem200: type=11, attr=0x8000000000000000, range=[0x00000000ffcb4000-0x00000000ffffc000) (3MB) [ 0.000000] EFI: mem201: type=11, attr=0x8000000000000000, range=[0x00000000ffffc000-0x0000000100000000) (0MB) This arranges to pass the EFI memory type through to efi_ioremap, and makes efi_ioremap use ioremap rather than init_memory_mapping if the type is EFI_MEMORY_MAPPED_IO. With this, the above warning goes away. Signed-off-by: Paul Mackerras <paulus@samba.org> LKML-Reference: <19062.55858.533494.471153@cargo.ozlabs.ibm.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-07-30lguest: update commentryRusty Russell
Every so often, after code shuffles, I need to go through and unbitrot the Lguest Journey (see drivers/lguest/README). Since we now use RCU in a simple form in one place I took the opportunity to expand that explanation. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
2009-07-30lguest: fix comment styleRusty Russell
I don't really notice it (except to begrudge the extra vertical space), but Ingo does. And he pointed out that one excuse of lguest is as a teaching tool, it should set a good example. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com>
2009-07-27Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: geode: Mark mfgpt irq IRQF_TIMER to prevent resume failure x86, amd: Don't probe for extended APIC ID if APICs are disabled x86, mce: Rename incorrect macro name "CONFIG_X86_THRESHOLD" x86-64: Fix bad_srat() to clear all state x86, mce: Fix set_trigger() accessor x86: Fix movq immediate operand constraints in uaccess.h x86: Fix movq immediate operand constraints in uaccess_64.h x86: Add reboot fixup for SBC-fitPC2 x86: Include all of .data.* sections in _edata on 64-bit x86: Add quirk for Intel DG45ID board to avoid low memory corruption
2009-07-27mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()Benjamin Herrenschmidt
mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() Upcoming paches to support the new 64-bit "BookE" powerpc architecture will need to have the virtual address corresponding to PTE page when freeing it, due to the way the HW table walker works. Basically, the TLB can be loaded with "large" pages that cover the whole virtual space (well, sort-of, half of it actually) represented by a PTE page, and which contain an "indirect" bit indicating that this TLB entry RPN points to an array of PTEs from which the TLB can then create direct entries. Thus, in order to invalidate those when PTE pages are deleted, we need the virtual address to pass to tlbilx or tlbivax instructions. The old trick of sticking it somewhere in the PTE page struct page sucks too much, the address is almost readily available in all call sites and almost everybody implemets these as macros, so we may as well add the argument everywhere. I added it to the pmd and pud variants for consistency. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV] Acked-by: Nick Piggin <npiggin@suse.de> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-20x86: Fix movq immediate operand constraints in uaccess.hH. Peter Anvin
The movq instruction, generated by __put_user_asm() when used for 64-bit data, takes a sign-extended immediate ("e") not a zero-extended immediate ("Z"). Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: stable@kernel.org
2009-07-20x86: Fix movq immediate operand constraints in uaccess_64.hUros Bizjak
arch/x86/include/asm/uaccess_64.h uses wrong asm operand constraint ("ir") for movq insn. Since movq sign-extends its immediate operand, "er" constraint should be used instead. Attached patch changes all uses of __put_user_asm in uaccess_64.h to use "er" when "q" insn suffix is involved. Patch was compile tested on x86_64 with defconfig. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: stable@kernel.org
2009-07-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: x86/pci: insert ioapic resource before assigning unassigned resources
2009-07-17lguest: fix journeyMatias Zabaljauregui
fix: "make Guest" was complaining about duplicated G:032 Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-07-10Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits) perf report: Add "Fractal" mode output - support callchains with relative overhead rate perf_counter tools: callchains: Manage the cumul hits on the fly perf report: Change default callchain parameters perf report: Use a modifiable string for default callchain options perf report: Warn on callchain output request from non-callchain file x86: atomic64: Inline atomic64_read() again x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative() x86: atomic64: Improve atomic64_xchg() x86: atomic64: Export APIs to modules x86: atomic64: Improve atomic64_read() x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP x86: atomic64: Fix unclean type use in atomic64_xchg() x86: atomic64: Make atomic_read() type-safe x86: atomic64: Reduce size of functions x86: atomic64: Improve atomic64_add_return() x86: atomic64: Improve cmpxchg8b() x86: atomic64: Improve atomic64_read() x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too perf report: Annotate variable initialization ...
2009-07-10sched: INIT_PREEMPT_COUNTPeter Zijlstra
Pull the initial preempt_count value into a single definition site. Maintainers for: alpha, ia64 and m68k, please have a look, your arch code is funny. The header magic is a bit odd, but similar to the KERNEL_DS one, CPP waits with expanding these macros until the INIT_THREAD_INFO macro itself is expanded, which is in arch/*/kernel/init_task.c where we've already included sched.h so we're good. Cc: tony.luck@intel.com Cc: rth@twiddle.net Cc: geert@linux-m68k.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10x86/pci: insert ioapic resource before assigning unassigned resourcesYinghai Lu
Stephen reported that his DL585 G2 needed noapic after 2.6.22 (?) Dann bisected it down to: commit 30a18d6c3f1e774de656ebd8ff219d53e2ba4029 Date: Tue Feb 19 03:21:20 2008 -0800 x86: multi pci root bus with different io resource range, on 64-bit It turns out that: 1. that AMD-based systems have two HT chains. 2. BIOS doesn't allocate resources for BAR 6 of devices under 8132 etc 3. that multi-peer-root patch will try to split root resources to peer root resources according to PCI conf of NB 4. PCI core assigns unassigned resources, but they overlap with BARs that are used by ioapic addr of io4 and 8132. The reason: at that point ioapic address are not inserted yet. Solution is to insert ioapic resources into the tree a bit earlier. Reported-by: Stephen Frost <sfrost@snowman.net> Reported-and-Tested-by: dann frazier <dannf@hp.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@jbarnes-g45.(none)>
2009-07-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits) cxgb3: Fix crash caused by stashing wrong netdev_queue ixgbe: Fix coexistence of FCoE and Flow Director in 82599 memory barrier: adding smp_mb__after_lock net: adding memory barrier to the poll and receive callbacks netpoll: Fix carrier detection for drivers that are using phylib includecheck fix: include/linux, rfkill.h p54: tx refused but queue active Atheros Kconfig needs to be dependent on WLAN_80211 mac80211: fix docbook mac80211_hwsim: avoid NULL access ssb: Add support for 4318E b43: Add support for 4318E zd1211rw: adding SONY IFU-WLM2 (054c:0257) as a zd1211b device zd1211rw: 07b8:6001 is a ZD1211B r6040: bump driver version to 0.24 and date to 08 July 2009 r6040: restore MIER register correctly when IRQ line is shared ipv4: Fix fib_trie rebalancing, part 4 (root thresholds) davinci_emac: fix kernel oops when changing MAC address while interface is down igb: set lan id prior to configuring phy mac80211: minstrel: avoid accessing negative indices in rix_to_ndx() ...
2009-07-09memory barrier: adding smp_mb__after_lockJiri Olsa
Adding smp_mb__after_lock define to be used as a smp_mb call after a lock. Making it nop for x86, since {read|write|spin}_lock() on x86 are full memory barriers. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-06Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix usage of bios intcall() x86: Remove unused function lapic_watchdog_ok() x86: Remove unused variable disable_x2apic x86, kvm: Fix section mismatches in kvm.c x86: Add missing annotation to arch/x86/lib/copy_user_64.S::copy_to_user x86: Fix fixmap page order for FIX_TEXT_POKE0,1 amd-iommu: set evt_buf_size correctly amd-iommu: handle alias entries correctly in init code x86: Fix printk call in print_local_apic() x86: Declare check_efer() before it gets used x86: Mark device_nb as static and fix NULL noise x86: Remove double declaration of MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1 xen: Use kcalloc() in xen_init_IRQ() x86: Fix fixmap ordering x86: Fix symbol annotation for arch/x86/lib/clear_page_64.S::clear_page_c
2009-07-04x86: atomic64: Inline atomic64_read() againEric Dumazet
Now atomic64_read() is light weight (no register pressure and small icache), we can inline it again. Also use "=&A" constraint instead of "+A" to avoid warning about unitialized 'res' variable. (gcc had to force 0 in eax/edx) $ size vmlinux.prev vmlinux.after text data bss dec hex filename 4908667 451676 1684868 7045211 6b805b vmlinux.prev 4908651 451676 1684868 7045195 6b804b vmlinux.after Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <4A4E1AA2.30002@gmail.com> [ Also fix typo in atomic64_set() export ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03x86: atomic64: Improve atomic64_xchg()Ingo Molnar
Remove the read-first logic from atomic64_xchg() and simplify the loop. This function was the last user of __atomic64_read() - remove it. Also, change the 'real_val' assumption from the somewhat quirky 1ULL << 32 value to the (just as arbitrary, but simpler) value of 0. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPPPaul Mackerras
Occasionally we get bugs where atomic_read or atomic_set are used on atomic64_t variables or vice versa. These bugs don't generate warnings on x86 because atomic_read and atomic_set are coded as macros rather than C functions, so we don't get any type-checking on their arguments; similarly for atomic64_read and atomic64_set in 64-bit kernels. This converts them to C functions so that the arguments are type-checked and bugs like this will get caught more easily. It also converts atomic_cmpxchg and atomic_xchg, and atomic64_cmpxchg and atomic64_xchg on 64-bit, so we get type-checking on their arguments too. Compiling a typical 64-bit x86 config, this generates no new warnings, and the vmlinux text is 86 bytes smaller. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03x86: Remove unused function lapic_watchdog_ok()Jaswinder Singh Rajput
lapic_watchdog_ok() is a global function but no one is using it. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1246554335.2242.29.camel@jaswinder.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03x86: Fix fixmap page order for FIX_TEXT_POKE0,1Mathieu Desnoyers
Masami reported: > Since the fixmap pages are assigned higher address to lower, > text_poke() has to use it with inverted order (FIX_TEXT_POKE1 > to FIX_TEXT_POKE0). I prefer to just invert the order of the fixmap declaration. It's simpler and more straightforward. Backward fixmaps seems to be used by both x86 32 and 64. It's really rare but a nasty bug, because it only hurts when instructions to patch are crossing a page boundary. If this happens, the fixmap write accesses will spill on the following fixmap, which may very well crash the system. And this does not crash the system, it could leave illegal instructions in place. Thanks Masami for finding this. It seems to have crept into the 2.6.30-rc series, so this calls for a -stable inclusion. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: <stable@kernel.org> LKML-Reference: <20090701213722.GH19926@Krystal> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03x86: atomic64: Make atomic_read() type-safeIngo Molnar
Linus noticed that atomic64_xchg() uses atomic_read(), which happens to work because atomic_read() is a macro so the .counter value gets u64-read on 32-bit too - but this is really bogus and serious bugs are waiting to happen. Change atomic_read() to be a type-safe inline, and this exposes the atomic64 bogosity as well: arch/x86/lib/atomic64_32.c: In function ‘atomic64_xchg’: arch/x86/lib/atomic64_32.c:39: warning: passing argument 1 of ‘atomic_read’ from incompatible pointer type Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03x86: atomic64: Move the 32-bit atomic64_t implementation to a .c fileIngo Molnar
Linus noted that the atomic64_t primitives are all inlines currently which is crazy because these functions have a large register footprint anyway. Move them to a separate file: arch/x86/lib/atomic64_32.c Also, while at it, rename all uses of 'unsigned long long' to the much shorter u64. This makes the appearance of the prototypes a lot nicer - and it also uncovered a few bugs where (yet unused) API variants had 'long' as their return type instead of u64. [ More intrusive changes are not yet done in this patch. ] Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit tooEric Dumazet
Locked instructions on two cache lines at once are painful. If atomic64_t uses two cache lines, my test program is 10x slower. The chance for that is significant: 4/32 or 12.5%. Make sure an atomic64_t is 8 bytes aligned. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain> [ changed it to __aligned(8) as per Andrew's suggestion ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-02x86: fix power-of-2 round_up/round_down macrosLinus Torvalds
These macros had two bugs: - the type of the mask was not correctly expanded to the full size of the argument being expanded, resulting in possible loss of high bits when mixing types. - the alignment argument was evaluated twice, despite the macro looking like a fancy function (but it really does need to be a macro, since it works on arbitrary integer types) Noticed by Peter Anvin, and with a fix that is a modification of his suggestion (bug noticed by Yinghai Lu). Cc: Peter Anvin <hpa@zytor.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-01perf_counter: Ignore the nmi call frames in the x86-64 backtracesFrederic Weisbecker
About every callchains recorded with perf record are filled up including the internal perfcounter nmi frame: perf_callchain perf_counter_overflow intel_pmu_handle_irq perf_counter_nmi_handler notifier_call_chain atomic_notifier_call_chain notify_die do_nmi nmi We want ignore this frame as it's not interesting for instrumentation. To solve this, we simply ignore every frames from nmi context. New example of "perf report -s sym -c" after this patch: 9.59% [k] search_by_key 4.88% search_by_key reiserfs_read_locked_inode reiserfs_iget reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 3.19% search_by_key search_by_entry_key reiserfs_find_entry reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 [...] For now this patch only solves the problem in x86-64. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01Fix pci_unmap_addr() et al on i386.David Woodhouse
We can run a 32-bit kernel on boxes with an IOMMU, so we need pci_unmap_addr() etc. to work -- without it, drivers will leak mappings. To be honest, this whole thing looks like it's more pain than it's worth; I'm half inclined to remove the no-op #else case altogether. But this is the minimal fix, which just does the right thing if CONFIG_DMAR is set. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org [ for 2.6.30 ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-01x86: Remove double declaration of MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1Jaswinder Singh Rajput
MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1 is already declared in msr-index.h. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> LKML-Reference: <1246450778.6940.8.camel@hpdv5.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-30Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (47 commits) perf report: Add --symbols parameter perf report: Add --comms parameter perf report: Add --dsos parameter perf_counter tools: Adjust only prelinked symbol's addresses perf_counter: Provide a way to enable counters on exec perf_counter tools: Reduce perf stat measurement overhead/skew perf stat: Use percentages for scaling output perf_counter, x86: Update x86_pmu after WARN() perf stat: Micro-optimize the code: memcpy is only required if no event is selected and !null_run perf stat: Improve output perf stat: Fix multi-run stats perf stat: Add -n/--null option to run without counters perf_counter tools: Remove dead code perf_counter: Complete counter swap perf report: Print sorted callchains per histogram entries perf_counter tools: Prepare a small callchain framework perf record: Fix unhandled io return value perf_counter tools: Add alias for 'l1d' and 'l1i' perf-report: Add bare minimum PERF_EVENT_READ parsing perf-report: Add modes for inherited stats and no-samples ...
2009-07-01x86: Fix fixmap orderingJan Beulich
The merge of the 32- and 64-bit fixmap headers made a latent bug on x86-64 a real one: with the right config settings it is possible for FIX_OHCI1394_BASE to overlap the FIX_BTMAP_* range. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> # for 2.6.30.x LKML-Reference: <4A4A0A8702000078000082E8@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-28Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, delay: tsc based udelay should have rdtsc_barrier x86, setup: correct include file in <asm/boot.h> x86, setup: Fix typo "CONFIG_x86_64" in <asm/boot.h> x86, mce: percpu mcheck_timer should be pinned x86: Add sysctl to allow panic on IOCK NMI error x86: Fix uv bau sending buffer initialization x86, mce: Fix mce resume on 32bit x86: Move init_gbpages() to setup_arch() x86: ensure percpu lpage doesn't consume too much vmalloc space x86: implement percpu_alloc kernel parameter x86: fix pageattr handling for lpage percpu allocator and re-enable it x86: reorganize cpa_process_alias() x86: prepare setup_pcpu_lpage() for pageattr fix x86: rename remap percpu first chunk allocator to lpage x86: fix duplicate free in setup_pcpu_remap() failure path percpu: fix too lazy vunmap cache flushing x86: Set cpu_llc_id on AMD CPUs