aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2009-04-26Merge branch 'irq-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/irq: mark NUMA_MIGRATE_IRQ_DESC broken x86, irq: Remove IRQ_DISABLED check in process context IRQ move
2009-04-26Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking: clarify kernel-taint warning message lockdep, x86: account for irqs enabled in paranoid_exit lockdep: more robust lockdep_map init sequence
2009-04-24Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (34 commits) ACPI, i915: Register ACPI video even when not modesetting Revert "ACPICA: delete check for AML access to port 0x81-83" I/O port protection: update for windows compatibility. sony-laptop: always try to unblock rfkill on load sony-laptop: fix bogus error message display on resume ACPI: EC: Fix ACPI EC resume non-query interrupt message sony-laptop: SNC input event 38 fix sony-laptop: SNC 127 Initialization Fix sony-laptop: Duplicate SNC 127 Event Fix ACPI: prevent processor.max_cstate=0 boot crash ACPI/hpet: prevent boot hang when hpet=force used on ICH-4M ACPI: delete obsolete "bus master activity" proc field ACPI: idle: mark_tsc_unstable() at init-time, not run-time ACPI: add /sys/firmware/acpi/interrupts/sci_not counter ACPI video: fix an error when the brightness levels on AC and on Battery are same acpi-cpufreq: Do not let get_measured perf depend on internal variable acpi-cpufreq: style-only: add parens to math expression acpi-cpufreq: Cleanup: Use printk_once x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf thinkpad-acpi: bump up version to 0.23 ...
2009-04-24Merge branch 'cpufreq' into releaseLen Brown
2009-04-22KVM: Unregister cpufreq notifier on unloadJan Kiszka
Properly unregister cpufreq notifier on onload if it was registered during init. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-04-22KVM: x86: release time_page on vcpu destructionJoerg Roedel
Not releasing the time_page causes a leak of that page or the compound page it is situated in. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-04-22KVM: MMU: disable global page optimizationMarcelo Tosatti
Complexity to fix it not worthwhile the gains, as discussed in http://article.gmane.org/gmane.comp.emulators.kvm.devel/28649. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-04-21FRV: Fix the section attribute on UP DECLARE_PER_CPU()David Howells
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU() does not agree with that specified by DEFINE_PER_CPU(). This means that architectures that have a small data section references relative to a base register may throw up linkage errors due to too great a displacement between where the base register points and the per-CPU variable. On FRV, the .h declaration says that the variable is in the .sdata section, but the .c definition says it's actually in the .data section. The linker throws up the following errors: kernel/built-in.o: In function `release_task': kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o To fix this, DECLARE_PER_CPU() should simply apply the same section attribute as does DEFINE_PER_CPU(). However, this is made slightly more complex by virtue of the fact that there are several variants on DEFINE, so these need to be matched by variants on DECLARE. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-21clocksource: pass clocksource to read() callbackMagnus Damm
Pass clocksource pointer to the read() callback for clocksources. This allows us to share the callback between multiple instances. [hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods] [akpm@linux-foundation.org: cleanup] Signed-off-by: Magnus Damm <damm@igel.co.jp> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-20Separate out common fstatat code into vfs_fstatatOleg Drokin
This is a version incorporating Christoph's suggestion. Separate out common *fstatat functionality into a single function instead of duplicating it all over the code. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-19acpi-cpufreq: Do not let get_measured perf depend on internal variableThomas Renninger
Take already available policy->cpuinfo.max_freq and get rid of acpi-cpufreq specific max_freq variable. This implies that P0 is always the highest frequency which should always be true as ACPI spec says: As a result, the zeroth entry describes the highest performance state Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19acpi-cpufreq: style-only: add parens to math expressionThomas Renninger
Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19acpi-cpufreq: Cleanup: Use printk_onceThomas Renninger
Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perfPallipadi, Venkatesh
Fix for a regression that was introduced by earlier commit 18b2646fe3babeb40b34a0c1751e0bf5adfdc64c on Mon Apr 6 11:26:08 2009 Regression resulted in the below error happened on systems with software coordination where per_cpu acpi data will not be initiated for secondary CPUs in a P-state domain. On Tue, 2009-04-14 at 23:01 -0700, Zhang, Yanmin wrote: My machine hanged with kernel 2.6.30-rc2 when script read > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor. > > opps happens in get_measured_perf: > > cur.aperf.whole = readin.aperf.whole - > per_cpu(drv_data, cpu)->saved_aperf; > > Because per_cpu(drv_data, cpu)=NULL. > > So function get_measured_perf should check if (per_cpu(drv_data, > cpu)==NULL) > and return 0 if it's NULL. --------------sys log------------------ BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9 PGD a7dd88067 PUD a7ccf5067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor CPU 0 Modules linked in: video output Pid: 2091, comm: kondemand/0 Not tainted 2.6.30-rc2 #1 MP Server RIP: 0010:[<ffffffff8021af75>] [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9 RSP: 0018:ffff880a7d56de20 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000046241a42b6 RCX: ffff88004d219000 RDX: 000000000000b660 RSI: 0000000000000020 RDI: 0000000000000001 RBP: ffff880a7f052000 R08: 00000046241a42b6 R09: ffffffff807639f0 R10: 00000000ffffffea R11: ffffffff802207f4 R12: ffff880a7f052000 R13: ffff88004d20e460 R14: 0000000000ddd5a6 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88004d200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000020 CR3: 0000000a7f1bf000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kondemand/0 (pid: 2091, threadinfo ffff880a7d56c000, task ffff880a7d4d18c0) Stack: ffff880a7f052078 ffffffff803efd54 00000046241a42b6 000000462ffa9e95 0000000000000001 0000000000000001 00000000ffffffea ffffffff8064f41a 0000000000000012 0000000000000012 ffff880a7f052000 ffffffff80650547 Call Trace: [<ffffffff803efd54>] ? kobject_get+0x12/0x17 [<ffffffff8064f41a>] ? __cpufreq_driver_getavg+0x42/0x57 [<ffffffff80650547>] ? do_dbs_timer+0x147/0x272 [<ffffffff80650400>] ? do_dbs_timer+0x0/0x272 [<ffffffff802474ca>] ? worker_thread+0x15b/0x1f5 [<ffffffff8024a02c>] ? autoremove_wake_function+0x0/0x2e [<ffffffff8024736f>] ? worker_thread+0x0/0x1f5 [<ffffffff80249f0d>] ? kthread+0x54/0x83 [<ffffffff8020c87a>] ? child_rip+0xa/0x20 [<ffffffff80249eb9>] ? kthread+0x0/0x83 [<ffffffff8020c870>] ? child_rip+0x0/0x20 Code: 99 a6 03 00 31 c9 85 c0 0f 85 c3 00 00 00 89 df 4c 8b 44 24 10 48 c7 c2 60 b6 00 00 48 8b 0c fd e0 30 a5 80 4c 89 c3 48 8b 04 0a <48> 2b 58 20 48 8b 44 24 18 48 89 1c 24 48 8b 34 0a 48 2b 46 28 RIP [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9 RSP <ffff880a7d56de20> CR2: 0000000000000020 ---[ end trace 2b8fac9a49e19ad4 ]--- Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19lguest: fix guest crash on non-linear addresses in gdt pvopsRusty Russell
Fixes guest crash 'lguest: bad read address 0x4800000 len 256' The new per-cpu allocator ends up handing a non-linear address to write_gdt_entry. We do __pa() on it, and hand it to the host, which kills us. I've long wanted to make the hypercall "LOAD_GDT_ENTRY" to match the IDT code, but had no pressing reason until now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: lguest@ozlabs.org
2009-04-18lockdep, x86: account for irqs enabled in paranoid_exitSteven Rostedt
I hit the check_flags error of lockdep: WARNING: at kernel/lockdep.c:2893 check_flags+0x1a7/0x1d0() [...] hardirqs last enabled at (12567): [<ffffffff8026206a>] local_bh_enable+0xaa/0x110 hardirqs last disabled at (12569): [<ffffffff80610c76>] int3+0x16/0x40 softirqs last enabled at (12566): [<ffffffff80514d2b>] lock_sock_nested+0xfb/0x110 softirqs last disabled at (12568): [<ffffffff8058454e>] tcp_prequeue_process+0x2e/0xa0 The check_flags warning of lockdep tells me that lockdep thought interrupts were disabled, but they were really enabled. The numbers in the above parenthesis show the order of events: 12566: softirqs last enabled: lock_sock_nested 12567: hardirqs last enabled: local_bh_enable 12568: softirqs last disabled: tcp_prequeue_process 12566: hardirqs last disabled: int3 int3 is a breakpoint! Examining this further, I have CONFIG_NET_TCPPROBE enabled which adds break points into the kernel. The paranoid_exit of the return of int3 does not account for enabling interrupts on return to kernel. This code is a bit tricky since it is also used by the nmi handler (when lockdep is off), and we must be careful about the swapgs. We can not call kernel code after the swapgs has been performed. [ Impact: fix lockdep check_flags warning + self-turn-off ] Acked-by: Peter Zijlsta <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix microcode driver newly spewing warnings x86, PAT: Remove page granularity tracking for vm_insert_pfn maps x86: disable X86_PTRACE_BTS for now x86, documentation: kernel-parameters replace X86-32,X86-64 with X86 x86: pci-swiotlb.c swiotlb_dma_ops should be static x86, PAT: Remove duplicate memtype reserve in devmem mmap x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype() x86, PAT: Changing memtype to WC ensuring no WB alias x86, PAT: Handle faults cleanly in set_memory_ APIs x86, PAT: Change order of cpa and free in set_memory_wb x86, CPA: Change idmap attribute before ioremap attribute setup
2009-04-17x86/irq: mark NUMA_MIGRATE_IRQ_DESC brokenYinghai Lu
It causes crash on system with lots of cards with MSI-X when irq_balancer enabled... The patches fixing it were both complex and fragile, according to Eric they were also doing quite dangerous things to the hardware. Instead we now have patches that solve this problem via static NUMA node mappings - not dynamic allocation and balancing. The patches are much simpler than this method but are still too large outside of the merge window, so we mark the dynamic balancer as broken for now, and queue up the new approach for v2.6.31. [ Impact: deactivate broken kernel feature ] Reported-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Rusty Russell <rusty@rustcorp.com.au> LKML-Reference: <49E68C41.4020801@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-16Merge branch 'x86/uv' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86/uv' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: UV BAU distribution and payload MMRs x86: UV: BAU partition-relative distribution map x86, uv: add Kconfig dependency on NUMA for UV systems x86: prevent /sys/firmware/sgi_uv from being created on non-uv systems x86, UV: Fix for nodes with memory and no cpus x86, UV: system table in bios accessed after unmap x86: UV BAU messaging timeouts x86: UV BAU and nodes with no memory
2009-04-17x86: fix microcode driver newly spewing warningsDmitry Adamushko
Jeff Garzik reported this WARN_ON() noise: > Kernel: 2.6.30-rc1-00306-g8371f87 > Hardware: ICH10 x86-64 > > This is a regression from 2.6.29. Microcode spews the following WARNING > multiple times during boot: > > ------------[ cut here ]------------ > WARNING: at fs/sysfs/group.c:138 sysfs_remove_group+0xeb/0xf0() > Hardware name: sysfs group ffffffffa0209700 not found for > kobject 'cpu0' Keep sysfs files around for cpus even when we failed to locate microcode for them at the moment of module loading. The appropriate microcode firmware can become available later on. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17x86, PAT: Remove page granularity tracking for vm_insert_pfn mapsPallipadi, Venkatesh
This change resolves the problem of too many single page entries in pat_memtype_list and "freeing invalid memtype" errors with i915, reported here: http://marc.info/?l=linux-kernel&m=123845244713183&w=2 Remove page level granularity track and untrack of vm_insert_pfn. memtype tracking at page granularity does not scale and cleaner approach would be for the driver to request a type for a bigger IO address range or PCI io memory range for that device, either at mmap time or driver init time and just use that type during vm_insert_pfn. This patch just removes the track/untrack of vm_insert_pfn. That means we will be in same state as 2.6.28, with respect to these APIs. Newer APIs for the drivers to request a memtype for a bigger region is coming soon. [ Impact: fix Xorg startup warnings and hangs ] Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Tested-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <20090408223716.GC3493@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-16x86: UV BAU distribution and payload MMRsCliff Wickman
This patch correctly sets BAU memory mapped registers to point to the sending activation descriptor table and target payload table. The "Broadcast Assist Unit" is used for TLB shootdown in UV. The memory mapped registers that point to sending and receiving memory structures contain node numbers. In one case the __pa() function did not provide the node id of memory on blade zero in configurations where that id is nonzero. In another case, it was assumed that memory was allocated on the local node. That assumption is not true in a configuration in which the node has no memory. Tested on the UV hardware simulator. [ Impact: fix possible runtime crash due to incorrect TLB logic ] Signed-off-by: Cliff Wickman <cpw@sgi.com> LKML-Reference: <E1LuR5Z-0007An-B8@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-15x86: disable X86_PTRACE_BTS for nowIngo Molnar
Oleg Nesterov found a couple of races in the ptrace-bts code and fixes are queued up for it but they did not get ready in time for the merge window. We'll merge them in v2.6.31 - until then mark the feature as CONFIG_BROKEN. There's no user-space yet making use of this so it's not a big issue. Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-15acpi-cpufreq: fix 'smp_call_function_many()' confusionLinus Torvalds
It turns out that 'smp_call_function_many()' doesn't work at all like 'smp_call_function_single()', and my change to Andrew's patch to use it rather than a loop over all CPU's acpi-cpufreq doesn't work. My bad. 'smp_call_function_many()' has two "features" (aka "documented bugs"): (a) it needs to be called with preemption disabled, because it uses smp_processor_id() without guarding the CPU lookup with 'get_cpu()' and 'put_cpu()' like the 'single' variant does. (b) even if the current CPU is part of the CPU mask, it won't do the call on that CPU. Still, we're better off trying to use 'smp_call_function_many()' than looping over CPU's, since it at least in theory allows us to use a broadcast IPI and do it all in parallel. So let's just work around the silly semantic bugs in that function. Reported-and-tested-by: Ali Gholami Rudi <ali@rudi.ir> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org>, Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-14x86 microcode: revert some work_on_cpuHugh Dickins
Revert part of af5c820a3169e81af869c113e18ec7588836cd50 ("x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c") That change is causing only one Intel CPU's microcode to be updated e.g. microcode: CPU3 updated from revision 0x9 to 0x17, date = 2005-04-22 where before it announced that also for CPU0 and CPU1 and CPU2. We cannot use work_on_cpu() in the CONFIG_MICROCODE_OLD_INTERFACE code, because Intel's request_microcode_user() involves a copy_from_user() from /sbin/microcode_ctl, which therefore needs to be on that CPU at the time. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-14x86: UV: BAU partition-relative distribution mapCliff Wickman
This patch enables each partition's BAU distribution bit map to be partition-relative. The distribution bitmap had been constructed assuming 0 as the base node number. That construct would not have allowed a total system of greater than 256 nodes. It also corrects an error that occurred when the first blade's nasid was not zero. That nasid was stored as the base node. The base node number gets added by hardware to the node numbers implied in the distribution bitmap, resulting in invalid target nasids. Tested on the UV hardware simulator. Signed-off-by: Cliff Wickman <cpw@sgi.com> LKML-Reference: <E1Ltl0C-0004Ob-37@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-14x86, irq: Remove IRQ_DISABLED check in process context IRQ movePallipadi, Venkatesh
As discussed in the thread here: http://marc.info/?l=linux-kernel&m=123964468521142&w=2 Eric W. Biederman observed: > It looks like some additional bugs have slipped in since last I looked. > > set_irq_affinity does this: > ifdef CONFIG_GENERIC_PENDING_IRQ > if (desc->status & IRQ_MOVE_PCNTXT || desc->status & IRQ_DISABLED) { > cpumask_copy(desc->affinity, cpumask); > desc->chip->set_affinity(irq, cpumask); > } else { > desc->status |= IRQ_MOVE_PENDING; > cpumask_copy(desc->pending_mask, cpumask); > } > #else > > That IRQ_DISABLED case is a software state and as such it has nothing to > do with how safe it is to move an irq in process context. [...] > > The only reason we migrate MSIs in interrupt context today is that there > wasn't infrastructure for support migration both in interrupt context > and outside of it. Yes. The idea here was to force the MSI migration to happen in process context. One of the patches in the series did disable_irq(dev->irq); irq_set_affinity(dev->irq, cpumask_of(dev->cpu)); enable_irq(dev->irq); with the above patch adding irq/manage code check for interrupt disabled and moving the interrupt in process context. IIRC, there was no IRQ_MOVE_PCNTXT when we were developing this HPET code and we ended up having this ugly hack. IRQ_MOVE_PCNTXT was there when we eventually submitted the patch upstream. But, looks like I did a blind rebasing instead of using IRQ_MOVE_PCNTXT in hpet MSI code. Below patch fixes this. i.e., revert commit 932775a4ab622e3c99bd59f14cc and add PCNTXT to HPET MSI setup. Also removes copying of desc->affinity in generic code as set_affinity routines are doing it internally. Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Li Shaohua" <shaohua.li@intel.com> Cc: Gary Hade <garyhade@us.ibm.com> Cc: "lcm@us.ibm.com" <lcm@us.ibm.com> Cc: suresh.b.siddha@intel.com LKML-Reference: <20090413222058.GB8211@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-13Fix quilt merge error in acpi-cpufreq.cLinus Torvalds
We ended up incorrectly using '&cur' instead of '&readin' in the work_on_cpu() -> smp_call_function_single() transformation in commit 01599fca6758d2cd133e78f87426fc851c9ea725 ("cpufreq: use smp_call_function_[single|many]() in acpi-cpufreq.c"). Andrew explains: "OK, the acpi tree went and had conflicting changes merged into it after I'd written the patch and it appears that I incorrectly reverted part of 18b2646fe3babeb40b34a0c1751e0bf5adfdc64c while fixing the resulting rejects. Switching it to `readin' looks correct." Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-14x86: pci-swiotlb.c swiotlb_dma_ops should be staticJaswinder Singh Rajput
Impact: reduce kernel size a bit, address sparse warning Addresses the problem pointed out by this sparse warning: arch/x86/kernel/pci-swiotlb.c:53:20: warning: symbol 'swiotlb_dma_ops' was not declared. Should it be static? For x86: swiotlb_dma_ops can be static, because it's not used outside of pci-swiotlb.c Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> LKML-Reference: <1239558861.3938.2.camel@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-13Merge branch 'for-rc1/xen/core' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'for-rc1/xen/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen: add FIX_TEXT_POKE to fixmap xen: honour VCPU availability on boot xen: clean up gate trap/interrupt constants xen: set _PAGE_NX in __supported_pte_mask before pagetable construction xen: resume interrupts before system devices. xen/mmu: weaken flush_tlb_other test xen/mmu: some early pagetable cleanups Xen: Add virt_to_pfn helper function x86-64: remove PGE from must-have feature list xen: mask XSAVE from cpuid NULL noise: arch/x86/xen/smp.c xen: remove xen_load_gdt debug xen: make xen_load_gdt simpler xen: clean up xen_load_gdt xen: split construction of p2m mfn tables from registration xen: separate p2m allocation from setting xen: disable preempt for leave_lazy_mmu
2009-04-13Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: add linux kernel support for YMM state x86: fix wrong section of pat_disable & make it static x86: Fix section mismatches in mpparse x86: fix set_fixmap to use phys_addr_t x86: Document get_user_pages_fast() x86, intr-remap: fix eoi for interrupt remapping without x2apic
2009-04-13cpufreq: use smp_call_function_[single|many]() in acpi-cpufreq.cAndrew Morton
Atttempting to rid us of the problematic work_on_cpu(). Just use smp_call_fuction_single() here. This repairs a 10% sysbench(oltp)+mysql regression which Mike reported, due to commit 6b44003e5ca66a3fffeb5bc90f40ada2c4340896 Author: Andrew Morton <akpm@linux-foundation.org> Date: Thu Apr 9 09:50:37 2009 -0600 work_on_cpu(): rewrite it to create a kernel thread on demand It seems that the kernel calls these acpi-cpufreq functions at a quite high frequency. Valdis Kletnieks also reports that this causes 70-90 forks per second on his hardware. Cc: Valdis.Kletnieks@vt.edu Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Zhao Yakui <yakui.zhao@intel.com> Acked-by: Dave Jones <davej@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mike Galbraith <efault@gmx.de> Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Ingo Molnar <mingo@elte.hu> [ Made it use smp_call_function_many() instead of looping over cpu's with smp_call_function_single() - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-12x86: add linux kernel support for YMM stateSuresh Siddha
Impact: save/restore Intel-AVX state properly between tasks Intel Advanced Vector Extensions (AVX) introduce 256-bit vector processing capability. More about AVX at http://software.intel.com/sites/avx Add OS support for YMM state management using xsave/xrstor infrastructure to support AVX. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1239402084.27006.8057.camel@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-12x86: fix wrong section of pat_disable & make it staticMarcin Slusarz
pat_disable cannot be __cpuinit anymore because it's called from pat_init and the callchain looks like this: pat_disable [cpuinit] <- pat_init <- generic_set_all <- ipi_handler <- set_mtrr <- (other non init/cpuinit functions) WARNING: arch/x86/mm/built-in.o(.text+0x449e): Section mismatch in reference from the function pat_init() to the function .cpuinit.text:pat_disable() The function pat_init() references the function __cpuinit pat_disable(). This is often because pat_init lacks a __cpuinit annotation or the annotation of pat_disable is wrong. Non CONFIG_X86_PAT version of pat_disable is static inline, so this version can be static too (and there are no callers outside of this file). Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <49DFB055.6070405@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-12x86: Fix section mismatches in mpparseRakib Mullick
Impact: fix section mismatch In arch/x86/kernel/mpparse.c, smp_reserve_bootmem() has been called and also refers to a function which is in .init section. Thus causes the first warning. And check_irq_src() also requires an __init, because it refers to an .init section. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <b9df5fa10904102004g51265d9axc8d07278bfdb6ba0@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86: fix set_fixmap to use phys_addr_tMasami Hiramatsu
Impact: fix kprobes crash on 32-bit with RAM above 4G Use phys_addr_t for receiving a physical address argument instead of unsigned long. This allows fixmap to handle pages higher than 4GB on x86-32. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: systemtap-ml <systemtap@sources.redhat.com> Cc: Gary Hade <garyhade@us.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <49DE3695.6040800@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86, PAT: Remove duplicate memtype reserve in devmem mmapSuresh Siddha
/dev/mem mmap code was doing memtype reserve/free for a while now. Recently we added memtype tracking in remap_pfn_range, and /dev/mem mmap uses it indirectly. So, we don't need seperate tracking in /dev/mem code any more. That means another ~100 lines of code removed :-). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20090409212709.085210000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype()Suresh Siddha
Fix pat_x_mtrr_type() to use UC_MINUS when the mtrr type return UC. This is to be consistent with ioremap() and ioremap_nocache() which uses UC_MINUS. Consolidate the code such that reserve_memtype() also uses pat_x_mtrr_type() when the caller doesn't specify any special attribute (non WB attribute). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20090409212708.939936000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86, PAT: Changing memtype to WC ensuring no WB aliasvenkatesh.pallipadi@intel.com
As per SDM, there should not be any aliasing of a WC with any cacheable type across CPUs. That is if one CPU is changing the identity map memtype to _WC, no other CPU at the time of this change should not have a TLB for this page that carries a WB attribute. SDM suggests to make the page not present. But for that we will have to handle any page faults that can potentially happen due to these pages being not present. Other way to deal with this without having any WB mapping is to change the page first to UC and then to WC. This ensures that we meet the SDM requirement of no cacheable alais to WC page. This also has same or lower overhead than marking the page not present and making it present later. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20090409212708.797481000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86, PAT: Handle faults cleanly in set_memory_ APIsvenkatesh.pallipadi@intel.com
Handle faults and do proper cleanups in set_memory_*() functions. In some cases, these functions were not doing proper free on failure paths. With the changes to tracking memtype of RAM pages in struct page instead of pat list, we do not need the changes in commits c5e147. This patch reverts that change. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20090409212708.653222000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86, PAT: Change order of cpa and free in set_memory_wbvenkatesh.pallipadi@intel.com
To be free of aliasing due to races, set_memory_* interfaces should follow ordering of reserving, changing memtype to UC/WC, changing memtype back to WB followed by free. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20090409212708.512280000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86, CPA: Change idmap attribute before ioremap attribute setupSuresh Siddha
Change the identity mapping with the requested attribute first, before we setup the virtual memory mapping with the new requested attribute. This makes sure that there is no window when identity map'ed attribute may disagree with ioremap range on the attribute type. This also avoids doing cpa on the ioremap'ed address twice (first in ioremap_page_range and then in ioremap_change_attr using vaddr), and should improve ioremap performance a bit. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <20090409212708.373330000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86: Document get_user_pages_fast()Andy Grover
While better than get_user_pages(), the usage of gupf(), especially the return values and the fact that it can potentially only partially pin the range, warranted some documentation. Signed-off-by: Andy Grover <andy.grover@oracle.com> Cc: npiggin@suse.de Cc: akpm@linux-foundation.org LKML-Reference: <1239320729-3262-1-git-send-email-andy.grover@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10x86, intr-remap: fix eoi for interrupt remapping without x2apicWeidong Han
To simplify level irq migration in the presence of interrupt-remapping, Suresh used a virtual vector (io-apic pin number) to eliminate io-apic RTE modification. Level triggered interrupt will appear as an edge to the local apic cpu but still as level to the IO-APIC. So in addition to do the local apic EOI, it still needs to do IO-APIC directed EOI to clear the remote IRR bit in the IO-APIC RTE. Pls refer to Suresh's patch for more details (commit 0280f7c416c652a2fd95d166f52b199ae61122c0). Now interrupt remapping is decoupled from x2apic, it also needs to do the directed EOI for apic. Otherwise, apic interrupts won't work correctly. Signed-off-by: Weidong Han <weidong.han@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Weidong Han <weidong.han@intel.com> Cc: suresh.b.siddha@intel.com Cc: dwmw2@infradead.org Cc: allen.m.kay@intel.com LKML-Reference: <1239355037-22856-1-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-09x86: fix set_fixmap to use phys_addr_tMasami Hiramatsu
Use phys_addr_t for receiving a physical address argument instead of unsigned long. This allows fixmap to handle pages higher than 4GB on x86-32. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-09Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: cpu_debug remove execute permission x86: smarten /proc/interrupts output for new counters x86: DMI match for the Dell DXP061 as it needs BIOS reboot x86: make 64 bit to use default_inquire_remote_apic x86, setup: un-resequence mode setting for VGA 80x34 and 80x60 modes x86, intel-iommu: fix X2APIC && !ACPI build failure
2009-04-09Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: consolidate documents blktrace: pass the right pointer to kfree() tracing/syscalls: use a dedicated file header tracing: append a comma to INIT_FTRACE_GRAPH
2009-04-09x86: cpu_debug remove execute permissionJaswinder Singh Rajput
It seems by mistake these files got execute permissions so removing it. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> LKML-Reference: <1239211186.9037.2.camel@ht.satnam> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-09tracing/syscalls: use a dedicated file headerFrederic Weisbecker
Impact: fix build warnings and possibe compat misbehavior on IA64 Building a kernel on ia64 might trigger these ugly build warnings: CC arch/ia64/ia32/sys_ia32.o In file included from arch/ia64/ia32/sys_ia32.c:55: arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined In file included from include/linux/elf.h:7, from include/linux/module.h:14, from include/linux/ftrace.h:8, from include/linux/syscalls.h:68, from arch/ia64/ia32/sys_ia32.c:18: arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition [...] sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h to import the syscalls tracing prototypes. But including ftrace.h can pull too much things for a low level file, especially on ia64 where the ia32 private headers conflict with higher level headers. Now we isolate the syscall tracing headers in their own lightweight file. Reported-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jason Baron <jbaron@redhat.com> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Michael Rubin <mrubin@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Michael Davidson <md@google.com> LKML-Reference: <20090408184058.GB6017@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08xen: add FIX_TEXT_POKE to fixmapJeremy Fitzhardinge
FIX_TEXT_POKE[01] are used to map kernel addresses, so they're mapping pfns, not mfns. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>