aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2008-02-13x86: pit_clockevent can be staticHarvey Harrison
arch/x86/kernel/i8253.c:98:27: warning: symbol 'pit_clockevent' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-13x86: EFI runtime code mapping enhancementHuang, Ying
This patch enhances EFI runtime code memory mapping as following: - Move __supported_pte_mask & _PAGE_NX checking before invoking runtime_code_page_mkexec(). This makes it possible for compiler to eliminate runtime_code_page_mkexec() on machine without NX support. - Use set_memory_x/nx in early_mapping_set_exec(). This eliminates the duplicated implementation. This patch has been tested on Intel x86_64 platform with EFI64/32 firmware. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-13x86: EFI: fix use of unitialized variable and the cache logicThomas Gleixner
Andi Kleen pointed out that the cache attribute logic is reverse in efi_enter_virtual_mode(). This problem alone is harmless as we do not (yet) do cache attribute conflict resolution. (This bug was not present in the original EFI submission - I introduced it while fixing up rejects.) While reviewing this code I noticed a second, worse problem: the use of uninitialized md->virt_addr. Fix both problems. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-13x86: CPA: fix gbpages support in try_preserve_large_pageAndi Kleen
[ mingo@elte.hu: while gbpages cannot be enabled on mainline currently, keep the code uptodate and this fix is easy enough. ] Use correct page sizes and masks for GB pages in try_preserve_large_page() This prevents a boot hang on a GB capable system with CONFIG_DIRECT_GBPAGES enabled. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-13xen: unpin initial Xen pagetable once we're finished with itJeremy Fitzhardinge
Unpin the Xen-provided pagetable once we've finished with it, so it doesn't cause stray references which cause later swapper_pg_dir pagetable updates to fail. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Tested-by: Jody Belka <knew-linux@pimb.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-13x86/early_ioremap: don't assume we're using swapper_pg_dirJeremy Fitzhardinge
At the early stages of boot, before the kernel pagetable has been fully initialized, a Xen kernel will still be running off the Xen-provided pagetables rather than swapper_pg_dir[]. Therefore, readback cr3 to determine the base of the pagetable rather than assuming swapper_pg_dir[]. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Tested-by: Jody Belka <knew-linux@pimb.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-13x86: fixup machine_ops reboot_{32|64}.c unification falloutJody Belka
When reboot_32.c and reboot_64.c were unified (commit 4d022e35fd...), the machine_ops code was broken, leading to xen pvops kernels failing to properly halt/poweroff/reboot etc. This fixes that up. Signed-off-by: Jody Belka <knew-linux@pimb.org> Cc: Miguel Boton <mboton@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-11x86: vdso_install fixRoland McGrath
The makefile magic for installing the 32-bit vdso images on disk had a little error. A single-line change would fix that bug, but this does a little more to reduce the error-prone duplication of this bit of makefile variable magic. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-11x86: remove over noisy debug printkThomas Gleixner
pageattr-test.c contains a noisy debug printk that people reported. The condition under which it prints (randomly tapping into a mem_map[] hole and not being able to c_p_a() there) is valid behavior and not interesting to report. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-11Use proper abstractions in quirk_intel_irqbalanceMatthew Wilcox
Since we may not have a pci_dev for the device we need to access, we can't use pci_read_config_word. But raw_pci_read is an internal implementation detail; it's better to use the architected pci_bus_read_config_word interface. Using PCI_DEVFN instead of a mysterious constant helps reassure everyone that we really do intend to access device 8. [ Thanks to Grant Grundler for pointing out to me that this is exactly what the write immediately above this is doing -- enabling device 8 to respond to config space cycles. - Matthew Grant also says: "Can you also add a comment which points at the Intel documentation? The 'Intel E7320 Memory Controller Hub (MCH) Datasheet' at http://download.intel.com/design/chipsets/datashts/30300702.pdf Page 69 documents register F4h (DEVPRES1). And I just doubled checked that the 0xf4 register value is restored later in the quirk (obvious when you look at the code but not from the patch" so here it is. - Linus ] Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Acked-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-10Change pci_raw_ops to pci_raw_read/writeMatthew Wilcox
We want to allow different implementations of pci_raw_ops for standard and extended config space on x86. Rather than clutter generic code with knowledge of this, we make pci_raw_ops private to x86 and use it to implement the new raw interface -- raw_pci_read() and raw_pci_write(). Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-10PCI x86: always use conf1 to access config space below 256 bytesIvan Kokshaysky
Thanks to Loic Prylli <loic@myri.com>, who originally proposed this idea. Always using legacy configuration mechanism for the legacy config space and extended mechanism (mmconf) for the extended config space is a simple and very logical approach. It's supposed to resolve all known mmconf problems. It still allows per-device quirks (tweaking dev->cfg_size). It also allows to get rid of mmconf fallback code. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: (32 commits) x86: cpa, strict range check in try_preserve_large_page() x86: cpa, enable CONFIG_DEBUG_PAGEALLOC on 64-bit x86: cpa, use page pool x86: introduce page pool in cpa x86: DEBUG_PAGEALLOC: enable after mem_init() brk: help text typo fix lguest: accept guest _PAGE_PWT page table entries x86 PM: update stale comments x86 PM: consolidate suspend and hibernation code x86 PM: rename 32-bit files in arch/x86/power x86 PM: move 64-bit hibernation files to arch/x86/power x86: trivial printk optimizations x86: fix early_ioremap pagetable ops x86: construct 32-bit boot time page tables in native format. x86, core: remove CONFIG_FORCED_INLINING x86: avoid unused variable warning in mm/init_64.c x86: fixup more paravirt fallout brk: document randomize_va_space and CONFIG_COMPAT_BRK (was Re: x86: fix sparse warnings in acpi/bus.c x86: fix sparse warning in topology.c ...
2008-02-09Update arch/x86/boot/.gitignore with new auto-generated filesS.Çağlar Onur
Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-09x86: cpa, strict range check in try_preserve_large_page()Thomas Gleixner
Right now, we check only the first 4k page for static required protections. This does not take overlapping regions into account. So we might end up setting the wrong permissions/protections for other parts of this large page. This can be optimized further, but correctness is the important part. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-09x86: cpa, enable CONFIG_DEBUG_PAGEALLOC on 64-bitThomas Gleixner
Now, that the page pool is in place we can enable DEBUG_PAGEALLOC on 64bit. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-09x86: cpa, use page poolThomas Gleixner
Switch the split page code to use the page pool. We do this unconditionally to avoid different behaviour with and without DEBUG_PAGEALLOC enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-09x86: introduce page pool in cpaThomas Gleixner
DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup hardcoded reliance on PSE pages, and the unrobustness of the runtime splitup of large pages. The splitup ended in recursive calls to alloc_pages() when a page for a pte split was requested. Avoid the recursion with a preallocated page pool, which is used to split up large mappings and gets refilled in the return path of kernel_map_pages after the split has been done. The size of the page pool is adjusted to the available memory. This part just implements the page pool and the initialization w/o using it yet. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-09x86 PM: update stale commentsRafael J. Wysocki
In some suspend and hibernation files in arch/x86/power there are comments referring to arch/x86-64 and arch/i386 . Update them to reflect the current code layout. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86 PM: consolidate suspend and hibernation codeRafael J. Wysocki
Move the hibernation-specific code from arch/x86/power/suspend_64.c to a separate file (hibernate_64.c) and the CPU-handling code to cpu_64.c (in line with the corresponding 32-bit code). Simplify arch/x86/power/Makefile . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86 PM: rename 32-bit files in arch/x86/powerRafael J. Wysocki
Rename cpu.c, suspend.c and swsusp.S in arch/x86/power to cpu_32.c, hibernate_32.c and hibernate_asm_32.S, respectively, and update the purpose and copyright information in these files. Update the Makefile in arch/x86/power to reflect the above changes. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86 PM: move 64-bit hibernation files to arch/x86/powerRafael J. Wysocki
Move arch/x86/kernel/suspend_64.c to arch/x86/power . Move arch/x86/kernel/suspend_asm_64.S to arch/x86/power as hibernate_asm_64.S . Update purpose and copyright information in arch/x86/power/suspend_64.c and arch/x86/power/hibernate_asm_64.S . Update the Makefiles in arch/x86, arch/x86/kernel and arch/x86/power to reflect the above changes. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: trivial printk optimizationsDenys Vlasenko
In arch/x86/boot/printf.c gets rid of unused tail of digits: const char *digits = "0123456789abcdefghijklmnopqrstuvwxyz"; (we are using 0-9a-f only) Uses smaller/faster lowercasing (by ORing with 0x20) if we know that we work on numbers/digits. Makes strtoul smaller, and also we are getting rid of static const char small_digits[] = "0123456789abcdefx"; static const char large_digits[] = "0123456789ABCDEFX"; since this works equally well: static const char digits[16] = "0123456789ABCDEF"; Size savings: $ size vmlinux.org vmlinux text data bss dec hex filename 877320 112252 90112 1079684 107984 vmlinux.org 877048 112252 90112 1079412 107874 vmlinux It may be also a tiny bit faster because code has less branches now, but I doubt it is measurable. [ hugh@veritas.com: uppercase pointers fix ] Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: fix early_ioremap pagetable opsIan Campbell
Some important parts of f6df72e71eba621b2f5c49b3a763116fac748f6e got dropped along the way, reintroduce them. Only affects paravirt guests. Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: construct 32-bit boot time page tables in native format.Ian Campbell
Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled kernel are in PAE format. early_ioremap is updated to use the standard page table accessors. Clear any mappings beyond max_low_pfn from the boot page tables in native_pagetable_setup_start because the initial mappings can extend beyond the range of physical memory and into the vmalloc area. Derived from patches by Eric Biederman and H. Peter Anvin. [ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ] Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika Penttilä <mika.penttila@kolumbus.fi> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86, core: remove CONFIG_FORCED_INLININGHarvey Harrison
Other than the defconfigs, remove the entry in compiler-gcc4.h, Kconfig.debug and feature-removal-schedule.txt. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: avoid unused variable warning in mm/init_64.cThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-09x86: fixup more paravirt falloutIngo Molnar
Use a common irq_return entry point for all the iret places, which need the paravirt INTERRUPT return wrapper. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: fix sparse warning in topology.cHarvey Harrison
arch/x86/kernel/topology.c:56:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: sparse warnings in pageattr.cHarvey Harrison
Adjust the definition of lookup_address to take an unsigned long level argument. Adjust callers in xen/mmu.c that pass in a dummy variable. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: sparse warning in therm_throt.cHarvey Harrison
arch/x86/kernel/cpu/mcheck/therm_throt.c:121:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: fix sparse warning in xen/time.cHarvey Harrison
Use xen_khz to denote xen_specific clock speed. Avoid shadowing cpu_khz. arch/x86/xen/time.c:220:6: warning: symbol 'cpu_khz' shadows an earlier one include/asm/tsc.h:17:21: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: MFGPT: fix typo in printk in mfgpt_timer_setupArnd Hannemann
Signed-off-by: Arnd Hannemann <hannemann@i4.informatik.rwth-aachen.de> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: make sure the right MFGPT timer fired the timer tickJordan Crouse
Each AMD Geode MFGPT timer interrupt output is paired with another timer; esentially the interrupt goes if either timer fires. This is okay, but the handlers need to be aware of this. Make sure in the timer tick handler that our timer really did expire. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: MFGPT: fix a potential race when disabling a timerAndres Salomon
We *really* don't want to be reading MFGPTx_SETUP and writing back those values. What we want to be doing is clearing CMP1 and CMP2 unconditionally; otherwise, we have races where CMP1 and/or CMP2 fire after we've read MFGPTx_SETUP. They can also fire between when we've written ~CNTEN to the register, and when the new register values get copied to the timer's version of the register. By clearing both fields, we're okay. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: MFGPT: Use "just-in-time" detection for the MFGPT timersJordan Crouse
There isn't much value to always detecting the MFGPT timers on Geode platforms; detection is only needed when something wants to use the timers. Move the detection code so that it gets called the first time a timer is allocated. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: MFGPT: make mfgpt_timer_setup available outside of mfgpt_32.cAndres Salomon
We need to be called from elsewhere, and this gets some #ifdefs out of the .c file. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: MFGPT: replace 'flags' field with 'avail' bitAndres Salomon
Drop F_AVAIL and the 'flags' field, replacing with an 'avail' bit. This looks more understandable to me. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: MFGPT: drop module owner usage from MFGPT APIAndres Salomon
We had planned to use the 'owner' field for allowing re-allocation of MFGPTs; however, doing it by module owner name isn't flexible enough. So, drop this for now. If it turns out that we need timers in modules, we'll need to come up with a scheme that matches the write-once fields of the MFGPTx_SETUP register, and drops ponies from the sky. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE fix MFGPT input clock valueWilly Tarreau
The GEODE MFGPT code assumed that 32kHz was 32000 Hz while the boards run on a 32.768 kHz digital watch crystal. In practise, it will not change the timer's frequency as the skew was only 2.4%, but it should provide more accurate intervals. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09x86: GEODE: MFGPT: Minor cleanupsAndres Salomon
- uninline timer functions; the compiler knows better than we do whether or not to inline these. - mfgpt_start_timer() had an unused 'clock' argument, drop it. From both Jordan and myself. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuildLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: Kbuild: Fix deb-pkg target to work with kernel versions ending with -<text-without-digit> ide: introduce HAVE_IDE kbuild: silence CHK/UPD messages according to $(quiet) scsi: fix makefile for aic7(3*x) kbuild/modpost: Use warn() for announcing section mismatches Add binoffset to gitignore kbuild/modpost: improve warnings if symbol is unknown
2008-02-09ide: introduce HAVE_IDESam Ravnborg
To allow flexible configuration of IDE introduce HAVE_IDE. All archs except arm, um and s390 unconditionally select it. For arm the actual configuration determine if IDE is supported. This is a step towards introducing drivers/Kconfig for arm. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Russell King - ARM Linux <linux@arm.linux.org.uk> Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-09cpuidle: build fix for non-x86Venki Pallipadi
The last posted version of this patch gave compile error on IA64. So, here goes yet another rewrite of the patch. Convert cpu_idle_wait() to cpuidle_kick_cpus() which is SMP-only, and gives error on non supported CPU. Changes from last patch sent by Kevin: Moved the definition of kick_cpus back to cpuidle.c from cpuidle.h: * Having it in .h gives #error on archs which includes the header file without actually having CPU_IDLE configured. To make it work in .h, we need one more #ifdef around that code which makes it messy. * Also, the function is only called from one file. So, it can be in declared statically in .c rather than making it available to everyone who includes the .h file. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Kevin Hilman <khilman@mvista.com> Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-08CONFIG_HIGHPTE vs. sub-page page tables.Martin Schwidefsky
Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a later patch. For everybody else it will be a (struct page *). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aout: remove unnecessary inclusions of {asm, linux}/a.out.hDavid Howells
Remove now unnecessary inclusions of {asm,linux}/a.out.h. [akpm@linux-foundation.org: fix alpha build] Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aout: suppress A.OUT library support if !CONFIG_ARCH_SUPPORTS_AOUTDavid Howells
Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set. Not all architectures support the A.OUT binfmt, so the ELF binfmt should not be permitted to go looking for A.OUT libraries to load in such a case. Not only that, but under such conditions A.OUT core dumps are not produced either. To make this work, this patch also does the following: (1) Makes the existence of the contents of linux/a.out.h contingent on CONFIG_ARCH_SUPPORTS_AOUT. (2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT core dumping code. (3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline. This is then included only where needed. This means that this bit of arch code will be stored in the appropriate A.OUT binfmt module rather than the core kernel. (4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not needed) and FRV. This patch depends on the previous patch to move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT format is available. [jdike@addtoit.com: uml: re-remove accidentally restored code] Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aout: mark arches that support A.OUT formatDavid Howells
Mark arches that support A.OUT format by including the following in their master Kconfig files: config ARCH_SUPPORTS_AOUT def_bool y This should also be set if the arch provides compatibility A.OUT support for an older arch, for instance x86_64 for i386 or sparc64 for sparc. I've guessed at which arches don't, based on comments in the code, however I'm sure that some of the ones I've marked as 'yes' actually should be 'no'. Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08time: fix typo in commentsLi Zefan
Fix typo in comments. BTW: I have to fix coding style in arch/ia64/kernel/time.c also, otherwise checkpatch.pl will be complaining. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07SLUB: Alternate fast paths using cmpxchg_localChristoph Lameter
Provide an alternate implementation of the SLUB fast paths for alloc and free using cmpxchg_local. The cmpxchg_local fast path is selected for arches that have CONFIG_FAST_CMPXCHG_LOCAL set. An arch should only set CONFIG_FAST_CMPXCHG_LOCAL if the cmpxchg_local is faster than an interrupt enable/disable sequence. This is known to be true for both x86 platforms so set FAST_CMPXCHG_LOCAL for both arches. Currently another requirement for the fastpath is that the kernel is compiled without preemption. The restriction will go away with the introduction of a new per cpu allocator and new per cpu operations. The advantages of a cmpxchg_local based fast path are: 1. Potentially lower cycle count (30%-60% faster) 2. There is no need to disable and enable interrupts on the fast path. Currently interrupts have to be disabled and enabled on every slab operation. This is likely avoiding a significant percentage of interrupt off / on sequences in the kernel. 3. The disposal of freed slabs can occur with interrupts enabled. The alternate path is realized using #ifdef's. Several attempts to do the same with macros and inline functions resulted in a mess (in particular due to the strange way that local_interrupt_save() handles its argument and due to the need to define macros/functions that sometimes disable interrupts and sometimes do something else). [clameter: Stripped preempt bits and disabled fastpath if preempt is enabled] Signed-off-by: Christoph Lameter <clameter@sgi.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>