Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2009-05-26	sh: pci-sh7751: Initialize io_map_base in controller definition.	Paul Mundt
	As there is only a single controller and remapping has no impact for the address range in question, just initialize it directly in the controller definition. This fixes up boot time warnings about not having the field initialized. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-26	sh: Add a KBUILD_DEFCONFIG for sh64.	Paul Mundt
	Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-26	sh: Record ms7724se in mach-types.	Paul Mundt
	Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-26	sh: Add ms7724se (SH7724) board support	Kuninori Morimoto
	This adds preliminary support for the ms7724se solution engine board. Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-26	sh: irq: Fix up imask build warnings.	Paul Mundt
	Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	sh: Prefer slab_is_available() over after_bootmem.	Paul Mundt
	This kills off after_bootmem and switches to using slab_is_available() instead. Presently the only place this is used is by the sh64 ioremap, and there's not much point in keeping the reference around otherwise. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	sh: Add a NR_IRQS_LEGACY for external IRQ0-7.	Paul Mundt
	This adds a NR_IRQS_LEGACY definition, which will be used by sparse irq. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	sh: Wrap irq_to_desc_alloc_cpu() around CONFIG_SPARSE_IRQ temporarily.	Paul Mundt
	irq_to_desc_alloc_cpu() has been renamed to irq_to_desc_alloc_node() in -next, but as we can not presently enable SPARSE_IRQ without the early irq_desc alloc patch, protect it with an ifdef until the interface has settled and we are ready to enable it system-wide. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	Merge branches 'sh/stable-updates' and 'sh/sparseirq'	Paul Mundt

2009-05-22	sh: ap325 camera without i2c driver fix	Magnus Damm
	This patch fixes the ap325rxa ncm03j camera code to handle the case where no i2c driver is present. Without this fix i2c_transfer() may be passed NULL as adapter which results in a crash. Triggered when i2c-sh_mobile.c failed to probe() due to missing MSTP clocks. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	sh: irq: Provide an arch_probe_nr_irqs() that wraps the machvec def.	Paul Mundt
	This is just a simple arch_probe_nr_irqs() stub that wraps to the platform defined number of IRQs. This can be made gradually more intelligent based on what we can infer from the INTC tables and so on. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	sh: irq: Teach ipr and intc about dynamically allocating irq_descs.	Paul Mundt
	This hooks in irq_to_desc_alloc_cpu() to the necessary code paths in the intc and ipr controller registration paths. As these are the primary call paths for all SH CPUs, this alone will make all CPUs sparse IRQ ready. There is the added benefit now that each CPU contains specific IPR and INTC tables, so only the vectors with interrupt sources backing them will ever see an irq_desc instantiation. This effectively packs irq_desc down to match the CPU, rather than padding NR_IRQS out to cover the valid vector range. Boards with extra sources will still have to fiddle with the nr_irqs setting, but they can continue doing so through the machvec as before. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	sh: irq: Convert from irq_desc[] to irq_to_desc().	Paul Mundt
	This converts a few places that were using the old irq_desc[] array over to the shiny new irq_to_desc() helper. Preperatory work for sparse irq support. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-22	sh: irq: Rework the SR.IMASK bitmap handling.	Paul Mundt
	This tidies up how the SR.IMASK bitmap is managed, using the bitmap API directly instead. At the same time, tidy up the irq_chip conversion a bit. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-20	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds
	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: MIPS: 64-bit: Fix system lockup. MIPS: IP28: Change to build with -mr10k-cache-barrier=store MIPS: IP22: Fix hang in power button interrupt handler MIPS: IP32: Fix hang on shutdown in power button interrupt handler.
2009-05-20	Merge master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds
	* master.kernel.org:/home/rmk/linux-2.6-arm: (25 commits) [ARM] 5519/1: amba probe: pass "struct amba_id " instead of void [ARM] 5517/1: integrator: don't put clock lookups in __initdata [ARM] 5518/1: versatile: don't put clock lookups in __initdata [ARM] mach-l7200: fix spelling of SYS_CLOCK_OFF [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 [ARM] realview: fix broadcast tick support [ARM] realview: remove useless smp_cross_call_done() [ARM] smp: fix cpumask usage in ARM SMP code [ARM] 5513/1: Eurotech VIPER SBC: fix compilation error [ARM] 5509/1: ep93xx: clkdev enable UARTS ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2 ARM: OMAP3: Fix HW SAVEANDRESTORE shift define ARM: OMAP3: Fix number of GPIO lines for 34xx [ARM] S3C: Do not set clk->owner field if unset [ARM] S3C2410: mach-bast.c registering i2c data too early [ARM] S3C24XX: Fix unused code warning in arch/arm/plat-s3c24xx/dma.c [ARM] S3C64XX: fix GPIO debug [ARM] S3C64XX: GPIO include cleanup [ARM] nwfpe: fix 'floatx80_is_nan' sparse warning [ARM] nwfpe: Add decleration for ExtendedCPDO ...
2009-05-20	MIPS: 64-bit: Fix system lockup.	Greg Ungerer
	The address range size calculation inside local_flush_tlb_kernel_range() is being truncated by a too small size variable holder on 64-bit systems. The truncated size can result in an erroneous tlbsize check that means we sit spinning inside a loop trying to flush a hige number of TLB entries. This is for all intents and purposes a system hang. Fix by using an appropriately sized valiable to hold the size. [Ralf: Greg's original patch submission identified the issue and fixed one instance in tlb-r4k.c but there there were several more. For consistency I also modified tlb-r3k.c even though that file is only used on 32-bit.] Signed-off-by: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-05-20	MIPS: IP28: Change to build with -mr10k-cache-barrier=store	peter fuerst
	Richard Sandiford's new code for inserting the cache-barriers, for GCC 4.3 and above and already incorporated in the current GCC-release, uses a slightly different option-syntax. Signed-off-by: peter fuerst <post@pfrst.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-05-20	MIPS: IP22: Fix hang in power button interrupt handler	Ralf Baechle
	The hang was caused by the use of disable_irq() from the interrupt handler itself. Fixed by the use of disable_irq_nosync(). The issue was triggered by: commit 3aa551c9b4c40018f0e261a178e3d25478dc04a9 Author: Thomas Gleixner <tglx@linutronix.de> Date: Mon Mar 23 18:28:15 2009 +0100 genirq: add threaded interrupt handler support Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-05-20	MIPS: IP32: Fix hang on shutdown in power button interrupt handler.	Andrew Randrianasulu
	The hang was caused by the use of disable_irq() from the interrupt handler itself. Fixed by the use of disable_irq_nosync(). The issue was triggered by: commit 3aa551c9b4c40018f0e261a178e3d25478dc04a9 Author: Thomas Gleixner <tglx@linutronix.de> Date: Mon Mar 23 18:28:15 2009 +0100 genirq: add threaded interrupt handler support Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2009-05-20	sh: mach-hp6xx: Fix up the hp6xx build for hd64461 changes.	Paul Mundt
	Fixes several compile errors due to the recent hd64461 I/O base changes. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-05-19	Merge branch 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze	Linus Torvalds
	* 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Fix kind-of-intr checking against number of interrupts microblaze: Update Microblaze defconfig
2009-05-18	Merge branch 'merge' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Explicit alignment for .data.cacheline_aligned powerpc/ps3: Update ps3_defconfig powerpc/ftrace: Fix constraint to be early clobber powerpc/ftrace: Use pr_devel() in ftrace.c powerpc: Do not assert pte_locked for hugepage PTE entries
2009-05-18	[ARM] 5517/1: integrator: don't put clock lookups in __initdata	Rabin Vincent
	Remove the __initdata annotation for the clock lookups, since they will be needed when loading modules which use clk_get(). Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-18	[ARM] 5518/1: versatile: don't put clock lookups in __initdata	Rabin Vincent
	Remove the __initdata annotation for the clock lookups, since they will be needed when loading modules which use clk_get(). Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-18	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix performance regression caused by paravirt_ops on native kernels xen: use header for EXPORT_SYMBOL_GPL x86, 32-bit: fix kernel_trap_sp() x86: fix percpu_{to,from}_op() x86: mtrr: Fix high_width computation when phys-addr is >= 44bit x86: Fix false positive section mismatch warnings in the apic code
2009-05-18	Merge branch 'tracing-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Append prompt in /debug/tracing/README file x86/function-graph: fix constraint for recording old return value
2009-05-18	microblaze: Fix kind-of-intr checking against number of interrupts	Michal Simek
	+ Fix typographic fault. Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-05-18	microblaze: Update Microblaze defconfig	Michal Simek
	Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-05-18	[ARM] mach-l7200: fix spelling of SYS_CLOCK_OFF	Pavel Roskin
	Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-18	[ARM] Double check memmap is actually valid with a memmap has unexpected ↵	Mel Gorman
	holes V2 pfn_valid() is meant to be able to tell if a given PFN has valid memmap associated with it or not. In FLATMEM, it is expected that holes always have valid memmap as long as there is valid PFNs either side of the hole. In SPARSEMEM, it is assumed that a valid section has a memmap for the entire section. However, ARM and maybe other embedded architectures in the future free memmap backing holes to save memory on the assumption the memmap is never used. The page_zone linkages are then broken even though pfn_valid() returns true. A walker of the full memmap must then do this additional check to ensure the memmap they are looking at is sane by making sure the zone and PFN linkages are still valid. This is expensive, but walkers of the full memmap are extremely rare. This was caught before for FLATMEM and hacked around but it hits again for SPARSEMEM because the page_zone linkages can look ok where the PFN linkages are totally screwed. This looks like a hatchet job but the reality is that any clean solution would end up consumning all the memory saved by punching these unexpected holes in the memmap. For example, we tried marking the memmap within the section invalid but the section size exceeds the size of the hole in most cases so pfn_valid() starts returning false where valid memmap exists. Shrinking the size of the section would increase memory consumption offsetting the gains. This patch identifies when an architecture is punching unexpected holes in the memmap that the memory model cannot automatically detect and sets ARCH_HAS_HOLES_MEMORYMODEL. At the moment, this is restricted to EP93xx which is the model sub-architecture this has been reported on but may expand later. When set, walkers of the full memmap must call memmap_valid_within() for each PFN and passing in what it expects the page and zone to be for that PFN. If it finds the linkages to be broken, it assumes the memmap is invalid for that PFN. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-18	powerpc: Explicit alignment for .data.cacheline_aligned	Benjamin Herrenschmidt
	I don't think anything guarantees that the objects in data.page_aligned are a multiple of PAGE_SIZE, thus the section may end on any boundary. So the following section, .data.cacheline_aligned needs an explicit alignment. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18	powerpc/ps3: Update ps3_defconfig	Geoff Levand
	Refresh and set these options: CONFIG_SYSFS_DEPRECATED_V2: y -> n CONFIG_INPUT_JOYSTICK: y -> n CONFIG_HID_SONY: n -> m CONFIG_RTC_DRV_PS3: - -> m Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18	powerpc/ftrace: Fix constraint to be early clobber	Steven Rostedt
	After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function graph tracer broke. This was discovered on my x86 boxes. The issue is that gcc used the same register for an output as it did for an input in an asm statement. I first thought this was a bug in gcc and reported it. I was notified that gcc was correct and that the output had to be flagged as an "early clobber". I noticed that powerpc had the same issue and this patch fixes it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18	powerpc/ftrace: Use pr_devel() in ftrace.c	Michael Ellerman
	pr_debug() can now result in code being generated even when #DEBUG is not defined. That's not really desirable in the ftrace code which we want to be snappy. With CONFIG_DYNAMIC_DEBUG=y: size before: text data bss dec hex filename 3334 672 4 4010 faa arch/powerpc/kernel/ftrace.o size after: text data bss dec hex filename 2616 360 4 2980 ba4 arch/powerpc/kernel/ftrace.o Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18	powerpc: Do not assert pte_locked for hugepage PTE entries	Mel Gorman
	With CONFIG_DEBUG_VM, an assertion is made when changing the protection flags of a PTE that the PTE is locked. Huge pages use a different pagetable format and the assertion is bogus and will always trigger with a bug looking something like Unable to handle kernel paging request for data at address 0xf1a00235800006f8 Faulting instruction address: 0xc000000000034a80 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=32 NUMA Maple Modules linked in: dm_snapshot dm_mirror dm_region_hash dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor windfarm_cpufreq_clamp windfarm_core i2c_powermac NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003 REGS: c000000003037600 TRAP: 0300 Not tainted (2.6.30-rc3-autokern1) MSR: 9000000000009032 <EE,ME,IR,DR> CR: 28002484 XER: 200fffff DAR: f1a00235800006f8, DSISR: 0000000040010000 TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2 GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500 GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001 GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8 GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000 GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20 GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000 GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8 GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880 NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0 LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4 Call Trace: [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable) [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4 [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674 [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8 [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828 [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584 [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c Instruction dump: 7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4 7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000 This patch fixes the problem by not asseting the PTE is locked for VMAs backed by huge pages. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-17	Merge branch 'smp-fix'	Russell King

2009-05-17	[ARM] realview: fix broadcast tick support	Russell King
	Having discussed broadcast tick support with Thomas Glexiner, the broadcast tick devices should be registered with a higher rating than the global tick device, and it should have the ONESHOT and PERIODIC feature flags set. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Thomas Glexiner <tglx@linutronix.de>
2009-05-17	[ARM] realview: remove useless smp_cross_call_done()	Russell King
	smp_cross_call_done() is a no-op for MPCore, and since it's only used by platform code, there's no point in having it unless it's doing something. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-17	[ARM] smp: fix cpumask usage in ARM SMP code	Russell King
	The ARM SMP code wasn't properly updated for the cpumask changes, which results in smp_timer_broadcast() broadcasting ticks to non-online CPUs. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-16	[ARM] 5513/1: Eurotech VIPER SBC: fix compilation error	Ricardo Martins
	Compilation for this board yields the following errors: arch/arm/mach-pxa/viper.c:511: error: 'FFUART' undeclared here (not in a function) arch/arm/mach-pxa/viper.c:520: error: 'BTUART' undeclared here (not in a function) arch/arm/mach-pxa/viper.c:529: error: 'STUART' undeclared here (not in a function) Fix them by including the necessary header. Signed-off-by: Ricardo Martins <rasm@fe.up.pt> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-16	[ARM] 5509/1: ep93xx: clkdev enable UARTS	Hartley Sweeten
	Fix the clkdev API support for the ep93xx uart clocks. The uarts available in the ep93xx have individual clock controls. The current implementation assumes that the bootloader has enabled the clocks before the kernel has booted. It also assumes that the bootloader has set the UARTBAUD bit indicating that the uarts are running off the 14.7456MHz external crystal. This fixes both issues. It also allows the uart clocks to be stopped when there are no users. Tested-by: Matthias Kaehlcke <matthias@kaehlcke.net> Cc: Ryan Mallon <ryan@bluewatersys.com> Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-16	Merge branch 'omap-fixes' of ↵	Russell King
	git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
2009-05-16	Merge branch 'fixes-rc5' of git://aeryn.fluff.org.uk/bjdooks/linux	Russell King

2009-05-16	ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2	Tony Lindgren
	This makes the framebuffer work on omap3. Also fix the clk_get usage for checkpatch.pl "ERROR: do not use assignment in if condition". Cc: Imre Deak <imre.deak@nokia.com> Cc: linux-fbdev-devel@lists.sourceforge.net Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Tony Lindgren <tony@atomide.com>
2009-05-16	ARM: OMAP3: Fix HW SAVEANDRESTORE shift define	Kalle Jokiniemi
	The OMAP3430ES2_SAVEANDRESTORE_SHIFT macro is used by powerdomain code in "1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT" manner, but the definition was also (1 << 4), meaning we actually modified bit 16. So the definition needs to be 4. This fixes also a cold reset HW bug in OMAP3430 ES3.x where some of the efuse bits are not isolated during wake-up from off mode. This can cause randomish cold resets with off mode. Enabling the USBTLL hardware SAVEANDRESTORE causes the core power up assert to be delayed in a way that we will not get faulty values when boot ROM is reading the unisolated registers. Signed-off-by: Kalle Jokiniemi <kalle.jokiniemi@digia.com> Acked-by: Kevin Hilman <khilman@deeprootsystems.com> Acked-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2009-05-16	ARM: OMAP3: Fix number of GPIO lines for 34xx	Vikram Pandita
	As per 3430 TRM, there are 6 banks [0 to 191] Signed-off-by: Tom Rix <Tom.Rix@windriver.com> Signed-off-by: Vikram Pandita <vikram.pandita@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2009-05-15	x86: Fix performance regression caused by paravirt_ops on native kernels	Jeremy Fitzhardinge
	Xiaohui Xin and some other folks at Intel have been looking into what's behind the performance hit of paravirt_ops when running native. It appears that the hit is entirely due to the paravirtualized spinlocks introduced by: \| commit 8efcbab674de2bee45a2e4cdf97de16b8e609ac8 \| Date: Mon Jul 7 12:07:51 2008 -0700 \| \| paravirt: introduce a "lock-byte" spinlock implementation The extra call/return in the spinlock path is somehow causing an increase in the cycles/instruction of somewhere around 2-7% (seems to vary quite a lot from test to test). The working theory is that the CPU's pipeline is getting upset about the call->call->locked-op->return->return, and seems to be failing to speculate (though I haven't seen anything definitive about the precise reasons). This doesn't entirely make sense, because the performance hit is also visible on unlock and other operations which don't involve locked instructions. But spinlock operations clearly swamp all the other pvops operations, even though I can't imagine that they're nearly as common (there's only a .05% increase in instructions executed). If I disable just the pv-spinlock calls, my tests show that pvops is identical to non-pvops performance on native (my measurements show that it is actually about .1% faster, but Xiaohui shows a .05% slowdown). Summary of results, averaging 10 runs of the "mmperf" test, using a no-pvops build as baseline: nopv Pv-nospin Pv-spin CPU cycles 100.00% 99.89% 102.18% instructions 100.00% 100.10% 100.15% CPI 100.00% 99.79% 102.03% cache ref 100.00% 100.84% 100.28% cache miss 100.00% 90.47% 88.56% cache miss rate 100.00% 89.72% 88.31% branches 100.00% 99.93% 100.04% branch miss 100.00% 103.66% 107.72% branch miss rt 100.00% 103.73% 107.67% wallclock 100.00% 99.90% 102.20% The clear effect here is that the 2% increase in CPI is directly reflected in the final wallclock time. (The other interesting effect is that the more ops are out of line calls via pvops, the lower the cache access and miss rates. Not too surprising, but it suggests that the non-pvops kernel is over-inlined. On the flipside, the branch misses go up correspondingly...) So, what's the fix? Paravirt patching turns all the pvops calls into direct calls, so _spin_lock etc do end up having direct calls. For example, the compiler generated code for paravirtualized _spin_lock is: <_spin_lock+0>: mov %gs:0xb4c8,%rax <_spin_lock+9>: incl 0xffffffffffffe044(%rax) <_spin_lock+15>: callq 0xffffffff805a5b30 <_spin_lock+22>: retq The indirect call will get patched to: <_spin_lock+0>: mov %gs:0xb4c8,%rax <_spin_lock+9>: incl 0xffffffffffffe044(%rax) <_spin_lock+15>: callq <__ticket_spin_lock> <_spin_lock+20>: nop; nop / or whatever 2-byte nop */ <_spin_lock+22>: retq One possibility is to inline _spin_lock, etc, when building an optimised kernel (ie, when there's no spinlock/preempt instrumentation/debugging enabled). That will remove the outer call/return pair, returning the instruction stream to a single call/return, which will presumably execute the same as the non-pvops case. The downsides arel 1) it will replicate the preempt_disable/enable code at eack lock/unlock callsite; this code is fairly small, but not nothing; and 2) the spinlock definitions are already a very heavily tangled mass of #ifdefs and other preprocessor magic, and making any changes will be non-trivial. The other obvious answer is to disable pv-spinlocks. Making them a separate config option is fairly easy, and it would be trivial to enable them only when Xen is enabled (as the only non-default user). But it doesn't really address the common case of a distro build which is going to have Xen support enabled, and leaves the open question of whether the native performance cost of pv-spinlocks is worth the performance improvement on a loaded Xen system (10% saving of overall system CPU when guests block rather than spin). Still it is a reasonable short-term workaround. [ Impact: fix pvops performance regression when running native ] Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com> Analysed-by: "Li Xin" <xin.li@intel.com> Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Xen-devel <xen-devel@lists.xensource.com> LKML-Reference: <4A0B62F7.5030802@goop.org> [ fixed the help text ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ASoC: DaVinci EVM board support buildfixes ASoC: DaVinci I2S updates ASoC: davinci-pcm buildfixes ALSA: pcsp: fix printk format warning ALSA: riptide: postfix increment and off by one pxa2xx-ac97: fix reset gpio mode setting ASoC: soc-core: fix crash when removing not instantiated card
2009-05-15	Merge branch 'for_linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: kgdb: gdb documentation fix kgdb,i386: use address that SP register points to in the exception frame sysrq, intel_fb: fix sysrq g collision