aboutsummaryrefslogtreecommitdiff
path: root/arch/ppc/mm
AgeCommit message (Collapse)Author
2008-02-08CONFIG_HIGHPTE vs. sub-page page tables.Martin Schwidefsky
Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a later patch. For everybody else it will be a (struct page *). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05add mm argument to pte/pmd/pud/pgd_freeBenjamin Herrenschmidt
(with Martin Schwidefsky <schwidefsky@de.ibm.com>) The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as first argument. The free functions do not get the mm_struct argument. This is 1) asymmetrical and 2) to do mm related page table allocations the mm argument is needed on the free function as well. [kamalesh@linux.vnet.ibm.com: i386 fix] [akpm@linux-foundation.org: coding-syle fixes] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-28[PPC] Remove 85xx from arch/ppcKumar Gala
85xx exists in arch/powerpc as well as cuImage support to boot from a u-boot that doesn't support device trees. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-01-28[PPC] Remove 83xx from arch/ppcKumar Gala
83xx exists in arch/powerpc as well as cuImage support to boot from a u-boot that doesn't support device trees. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-12-23[POWERPC] 4xx: Fix TLB 0 problem with CONFIG_SERIAL_TEXT_DEBUGStefan Roese
Right now TLB entry 0 ist used as UART0 mapping for the early debug output (via CONFIG_SERIAL_TEXT_DEBUG). This causes problems when many TLB's get used upon Linux bootup (e.g. while PCIe scanning behind bridges and/or switches on 440SPe platforms). This will overwrite the TLB 0 entry and further debug output's may crash/hang the system. This patch moves the early debug UART0 TLB entry from 0 to 62 as done in arch/powerpc. This way it is in the "pinned" area and will not get overwritten. Also the arch/ppc/mm/44x_mmu.c code is now synced with the newer code from arch/powerpc. Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-11-20[POWERPC] Fix 8xx build breakage due to _tlbie changesBenjamin Herrenschmidt
My changes to _tlbie to fix 4xx unfortunately broke 8xx build in a couple of places. This fixes it. Spotted by Olof Johansson. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-11-01[POWERPC] ppc405 Fix arithmatic rollover bug when memory size under 16MGrant Likely
mmu_mapin_ram() loops over total_lowmem to setup page tables. However, if total_lowmem is less that 16M, the subtraction rolls over and results in a number just under 4G (because total_lowmem is an unsigned value). This patch rejigs the loop from countup to countdown to eliminate the bug. Special thanks to Magnus Hjorth who wrote the original patch to fix this bug. This patch improves on his by making the loop code simpler (which also eliminates the possibility of another rollover at the high end) and also applies the change to arch/powerpc. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-11-01[POWERPC] 4xx: Deal with 44x virtually tagged icacheBenjamin Herrenschmidt
The 44x family has an interesting "feature" which is a virtually tagged instruction cache (yuck !). So far, we haven't dealt with it properly, which means we've been mostly lucky or people didn't report the problems, unless people have been running custom patches in their distro... This is an attempt at fixing it properly. I chose to do it by setting a global flag whenever we change a PTE that was previously marked executable, and flush the entire instruction cache upon return to user space when that happens. This is a bit heavy handed, but it's hard to do more fine grained flushes as the icbi instruction, on those processor, for some very strange reasons (since the cache is virtually mapped) still requires a valid TLB entry for reading in the target address space, which isn't something I want to deal with. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-11-01[POWERPC] 4xx: Fix 4xx flush_tlb_page()Benjamin Herrenschmidt
On 4xx CPUs, the current implementation of flush_tlb_page() uses a low level _tlbie() assembly function that only works for the current PID. Thus, invalidations caused by, for example, a COW fault triggered by get_user_pages() from a different context will not work properly, causing among other things, gdb breakpoints to fail. This patch adds a "pid" argument to _tlbie() on 4xx processors, and uses it to flush entries in the right context. FSL BookE also gets the argument but it seems they don't need it (their tlbivax form ignores the PID when invalidating according to the document I have). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-10-19pid namespaces: define is_global_init() and is_container_init()Serge E. Hallyn
is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16During VM oom condition, kill all threads in process groupWill Schmidt
We have had complaints where a threaded application is left in a bad state after one of it's threads is killed when we hit a VM: out_of_memory condition. Killing just one of the process threads can leave the application in a bad state, whereas killing the entire process group would allow for the application to restart, or be otherwise handled, and makes it very obvious that something has gone wrong. This change allows the entire process group to be taken down, rather than just the one thread. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-17[POWERPC] Remove APUS support from arch/ppcAdrian Bunk
Current status of APUS: - arch/powerpc/: removed in 2.6.23 - arch/ppc/: marked BROKEN since 2 years This therefore removes the remaining parts of APUS support from arch/ppc, include/asm-ppc, arch/powerpc and include/asm-powerpc. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-19mm: fault feedback #2Nick Piggin
This patch completes Linus's wish that the fault return codes be made into bit flags, which I agree makes everything nicer. This requires requires all handle_mm_fault callers to be modified (possibly the modifications should go further and do things like fault accounting in handle_mm_fault -- however that would be for another patch). [akpm@linux-foundation.org: fix alpha build] [akpm@linux-foundation.org: fix s390 build] [akpm@linux-foundation.org: fix sparc build] [akpm@linux-foundation.org: fix sparc64 build] [akpm@linux-foundation.org: fix ia64 build] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Still apparently needs some ARM and PPC loving - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-10[PPC] Add linux/pagemap.h to arch/ppc/mm/tlb.cLaurent Pinchart
When compiled without swap support, arch/mm/tlb.c complains about missing function declarations. This patch fixes the warnings. Signed-off-by: Laurent Pinchart <laurent.pinchart@technotrade.biz> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-05-23[PPC] Fix modpost warningKumar Gala
Mark pte_alloc_one_kernel as __init_refok to fix the following warning: WARNING: arch/ppc/mm/built-in.o(.text+0x1114): Section mismatch: reference to .init.text:early_get_page (between 'pte_alloc_one_kernel' and 'v_mapped_by_tlbcam') Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-05-23[PPC] Fix COMMON symbol warningsKumar Gala
We get the following warnings in various ARCH=ppc builds: WARNING: "ee_restarts" [arch/ppc/kernel/built-in] is COMMON symbol WARNING: "fee_restarts" [arch/ppc/kernel/built-in] is COMMON symbol WARNING: "htab_hash_searches" [arch/ppc/mm/built-in] is COMMON symbol WARNING: "next_slot" [arch/ppc/mm/built-in] is COMMON symbol WARNING: "mmu_hash_lock" [arch/ppc/mm/built-in] is COMMON symbol WARNING: "primary_pteg_full" [arch/ppc/mm/built-in] is COMMON symbol WARNING: "global_dbcr0" [arch/ppc/kernel/built-in] is COMMON symbol Switch to local symbols for ee_restarts, fee_restarts, and global_dbcr0 and global symbols for mmu_hash_lock, next_slot, primary_pteg_full, and htab_hash_searches. (except mmu_hash_lock which is global) and space directive instead. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-05-12[POWERPC] Spelling fixes: arch/ppc/Simon Arlott
Spelling fixes in arch/ppc/. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-02-09[POWERPC] Fix is_power_of_4(x) compile errorKumar Gala
When building an 85xx kernel we get: CC arch/powerpc/mm/pgtable_32.o arch/powerpc/mm/pgtable_32.c: In function 'io_block_mapping': arch/powerpc/mm/pgtable_32.c:330: error: expected identifier before '(' token arch/powerpc/mm/pgtable_32.c:330: error: expected statement before ')' token The is_power_of_2(x) fixup patch left an extra ')' on the is_power_of_4 macro. There is a similiar issue on the arch/ppc side. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2007-02-07[POWERPC] Add "is_power_of_2" checking to log2.h.Robert P. J. Day
Add the inline function "is_power_of_2()" to log2.h, where the value zero is *not* considered to be a power of two. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-11[PATCH] mm: use symbolic names instead of indices for zone initialisationMel Gorman
Arch-independent zone-sizing is using indices instead of symbolic names to offset within an array related to zones (max_zone_pfns). The unintended impact is that ZONE_DMA and ZONE_NORMAL is initialised on powerpc instead of ZONE_DMA and ZONE_HIGHMEM when CONFIG_HIGHMEM is set. As a result, the the machine fails to boot but will boot with CONFIG_HIGHMEM turned off. The following patch properly initialises the max_zone_pfns[] array and uses symbolic names instead of indices in each architecture using arch-independent zone-sizing. Two users have successfully booted their powerpcs with it (one an ibook G4). It has also been boot tested on x86, x86_64, ppc64 and ia64. Please merge for 2.6.19-rc2. Credit to Benjamin Herrenschmidt for identifying the bug and rolling the first fix. Additional credit to Johannes Berg and Andreas Schwab for reporting the problem and testing on powerpc. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] pidspace: is_init()Sukadev Bhattiprolu
This is an updated version of Eric Biederman's is_init() patch. (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and replaces a few more instances of ->pid == 1 with is_init(). Further, is_init() checks pid and thus removes dependency on Eric's other patches for now. Eric's original description: There are a lot of places in the kernel where we test for init because we give it special properties. Most significantly init must not die. This results in code all over the kernel test ->pid == 1. Introduce is_init to capture this case. With multiple pid spaces for all of the cases affected we are looking for only the first process on the system, not some other process that has pid == 1. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: <lxc-devel@lists.sourceforge.net> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] make PROT_WRITE imply PROT_READJason Baron
Make PROT_WRITE imply PROT_READ for a number of architectures which don't support write only in hardware. While looking at this, I noticed that some architectures which do not support write only mappings already take the exact same approach. For example, in arch/alpha/mm/fault.c: " if (cause < 0) { if (!(vma->vm_flags & VM_EXEC)) goto bad_area; } else if (!cause) { /* Allow reads even for write-only mappings */ if (!(vma->vm_flags & (VM_READ | VM_WRITE))) goto bad_area; } else { if (!(vma->vm_flags & VM_WRITE)) goto bad_area; } " Thus, this patch brings other architectures which do not support write only mappings in-line and consistent with the rest. I've verified the patch on ia64, x86_64 and x86. Additional discussion: Several architectures, including x86, can not support write-only mappings. The pte for x86 reserves a single bit for protection and its two states are read only or read/write. Thus, write only is not supported in h/w. Currently, if i 'mmap' a page write-only, the first read attempt on that page creates a page fault and will SEGV. That check is enforced in arch/blah/mm/fault.c. However, if i first write that page it will fault in and the pte will be set to read/write. Thus, any subsequent reads to the page will succeed. It is this inconsistency in behavior that this patch is attempting to address. Furthermore, if the page is swapped out, and then brought back the first read will also cause a SEGV. Thus, any arbitrary read on a page can potentially result in a SEGV. According to the SuSv3 spec, "if the application requests only PROT_WRITE, the implementation may also allow read access." Also as mentioned, some archtectures, such as alpha, shown above already take the approach that i am suggesting. The counter-argument to this raised by Arjan, is that the kernel is enforcing the write only mapping the best it can given the h/w limitations. This is true, however Alan Cox, and myself would argue that the inconsitency in behavior, that is applications can sometimes work/sometimes fails is highly undesireable. If you read through the thread, i think people, came to an agreement on the last patch i posted, as nobody has objected to it... Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Andi Kleen <ak@muc.de> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Ian Molton <spyro@f2s.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27[PATCH] Have Power use add_active_range() and free_area_init_nodes()Mel Gorman
Size zones and holes in an architecture independent manner for Power. [judith@osdl.org: build fix] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Andi Kleen <ak@muc.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Keith Mannthey" <kmannth@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-13powerpc: Fix some missed ppc32 mm->context.id conversionsPaul Mackerras
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-11powerpc: add context.vdso_base for 32-bit tooPaul Mackerras
This adds a vdso_base element to the mm_context_t for 32-bit compiles (both for ARCH=powerpc and ARCH=ppc). This fixes the compile errors that have been reported in arch/powerpc/kernel/signal_32.c. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-29[PATCH] lock PTE before updating it in 440/BookE page fault handlerEugene Surovegin
Fix 44x and BookE page fault handler to correctly lock PTE before trying to pte_update() it, otherwise this PTE might be swapped out after pte_present() check but before pte_uptdate() call, resulting in corrupted PTE. This can happen with enabled preemption and low memory condition. Signed-off-by: Eugene Surovegin <ebs@ebshome.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-28ppc: Remove CHRP, POWER3 and POWER4 support from arch/ppcPaul Mackerras
32-bit CHRP machines are now supported only in arch/powerpc, as are all 64-bit PowerPC processors. This means that we don't use Open Firmware on any platform in arch/ppc any more. This makes PReP support a single-platform option like every other platform support option in arch/ppc now, thus CONFIG_PPC_MULTIPLATFORM is gone from arch/ppc. CONFIG_PPC_PREP is the option that selects PReP support and is generally what has replaced CONFIG_PPC_MULTIPLATFORM within arch/ppc. _machine is all but dead now, being #defined to 0. Updated Makefiles, comments and Kconfig options generally to reflect these changes. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (78 commits) [PATCH] powerpc: Add FSL SEC node to documentation [PATCH] macintosh: tidy-up driver_register() return values [PATCH] powerpc: tidy-up of_register_driver()/driver_register() return values [PATCH] powerpc: via-pmu warning fix [PATCH] macintosh: cleanup the use of i2c headers [PATCH] powerpc: dont allow old RTC to be selected [PATCH] powerpc: make powerbook_sleep_grackle static [PATCH] powerpc: Fix warning in add_memory [PATCH] powerpc: update mailing list addresses [PATCH] powerpc: Remove calculation of io hole [PATCH] powerpc: iseries: Add bootargs to /chosen [PATCH] powerpc: iseries: Add /system-id, /model and /compatible [PATCH] powerpc: Add strne2a() to convert a string from EBCDIC to ASCII [PATCH] powerpc: iseries: Make more stuff static in platforms/iseries/mf.c [PATCH] powerpc: iseries: Remove pointless iSeries_(restart|power_off|halt) [PATCH] powerpc: iseries: mf related cleanups [PATCH] powerpc: Replace platform_is_lpar() with a firmware feature [PATCH] powerpc: trivial: Cleanup whitespace in cputable.h [PATCH] powerpc: Remove unused iommu_off logic from pSeries_init_early() [PATCH] powerpc: Unconfuse htab_bolt_mapping() callers ...
2006-03-22[PATCH] remove set_page_count() outside mm/Nick Piggin
set_page_count usage outside mm/ is limited to setting the refcount to 1. Remove set_page_count from outside mm/, and replace those users with init_page_count() and set_page_refcounted(). This allows more debug checking, and tighter control on how code is allowed to play around with page->_count. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-10[PATCH] powerpc: trivial: modify comments to refer to new location of filesJon Mason
This patch removes all self references and fixes references to files in the now defunct arch/ppc64 tree. I think this accomplises everything wanted, though there might be a few references I missed. Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-20[PATCH] powerpc: generalize PPC44x_PIN_SIZEMarcelo Tosatti
The following patch generalizes PPC44x_PIN_SIZE by changing it to PPC_PIN_SIZE, which can be defined by any sub-arch to automatically adjust VMALLOC_START. Define PPC_PIN_SIZE on 8xx, avoiding potential conflicts with the pinned space. Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-15[PATCH] ppc: Remove powermac support from ARCH=ppcPaul Mackerras
This makes it possible to build kernels for PReP and/or CHRP with ARCH=ppc by removing the (non-building) powermac support. It's now also possible to select PReP and CHRP independently. Powermac users should now build with ARCH=powerpc instead of ARCH=ppc. (This does mean that it is no longer possible to build a 32-bit kernel for a G5.) Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-16[PATCH] ppc32 8xx: update_mmu_cache() needs unconditional tlbieMarcelo Tosatti
Currently 8xx fails to boot due to endless pagefaults. Seems the bug is exposed by the change which avoids flushing the TLB when not necessary (in case the pte has not changed), introduced recently: __handle_mm_fault(): entry = pte_mkyoung(entry); if (!pte_same(old_entry, entry)) { ptep_set_access_flags(vma, address, pte, entry, write_access); update_mmu_cache(vma, address, entry); lazy_mmu_prot_update(entry); } else { /* * This is needed only for protection faults but the arch code * is not yet telling us if this is a protection fault or not. * This still avoids useless tlb flushes for .text page faults * with threads. */ if (write_access) flush_tlb_page(vma, address); } The "update_mmu_cache()" call was unconditional before, which caused the TLB to be flushed by: if (pfn_valid(pfn)) { struct page *page = pfn_to_page(pfn); if (!PageReserved(page) && !test_bit(PG_arch_1, &page->flags)) { if (vma->vm_mm == current->active_mm) { #ifdef CONFIG_8xx /* On 8xx, cache control instructions (particularly * "dcbst" from flush_dcache_icache) fault as write * operation if there is an unpopulated TLB entry * for the address in question. To workaround that, * we invalidate the TLB here, thus avoiding dcbst * misbehaviour. */ _tlbie(address); #endif __flush_dcache_icache((void *) address); } else flush_dcache_icache_page(page); set_bit(PG_arch_1, &page->flags); } Which worked to due to pure luck: PG_arch_1 was always unset before, but now it isnt. The root of the problem are the changes against the 8xx TLB handlers introduced during v2.6. What happens is the TLBMiss handlers load the zeroed pte into the TLB, causing the TLBError handler to be invoked (thats two TLB faults per pagefault), which then jumps to the generic MM code to setup the pte. The bug is that the zeroed TLB is not invalidated (the same reason for the "dcbst" misbehaviour), resulting in infinite TLBError faults. The "two exception" approach requires a TLB flush (to nuke the zeroed TLB) at each PTE update for correct behaviour: Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-13[PATCH] Update email address for KumarKumar Gala
Changed jobs and the Freescale address is no longer valid. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-31Merge ../linux-2.6 by handPaul Mackerras
2005-10-29[PATCH] mm: init_mm without ptlockHugh Dickins
First step in pushing down the page_table_lock. init_mm.page_table_lock has been used throughout the architectures (usually for ioremap): not to serialize kernel address space allocation (that's usually vmlist_lock), but because pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it. Reverse that: don't lock or unlock init_mm.page_table_lock in any of the architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take and drop it when allocating a new one, to check lest a racing task already did. Similarly no page_table_lock in vmalloc's map_vm_area. Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle user mms, which are converted only by a later patch, for now they have to lock differently according to whether or not it's init_mm. If sources get muddled, there's a danger that an arch source taking init_mm.page_table_lock will be mixed with common source also taking it (or neither take it). So break the rules and make another change, which should break the build for such a mismatch: remove the redundant mm arg from pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13). Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64 used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64 map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free took page_table_lock for no good reason. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29[PATCH] ppc: make phys_mem_access_prot() work with pfns instead of addressesRoland Dreier
Change the phys_mem_access_prot() function to take a pfn instead of an address. This allows mmap64() to work on /dev/mem for addresses above 4G on 32-bit architectures. We start with a pfn in mmap_mem(), so there's no need to convert to an address; in fact, it's actively bad, since the conversion can overflow when the address is above 4G. Similarly fix the ppc32 page_is_ram() function to avoid a conversion to an address by directly comparing to max_pfn. Working with max_pfn instead of high_memory fixes page_is_ram() to give the right answer for highmem pages. Signed-off-by: Roland Dreier <rolandd@cisco.com> Cc: Anton Blanchard <anton@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-10-28[PATCH] gfp_t: remaining bits of arch/*Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-22powerpc: Move agp_special_page export to where it is definedPaul Mackerras
... instead of exporting it in arch/*/kernel/ppc_ksyms.c. Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-09-19[PATCH] powerpc: Remove section free() and linker script bitsJon Loeliger
Here is a new patch that removes all notion of the pmac, prep, chrp and openfirmware initialization sections, and then unifies the sections.h files without those __pmac, etc, sections identifiers cluttering things up. Signed-off-by: Jon Loeliger <jdl@freescale.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-09-10[PATCH] ppc32: Kill init on unhandled synchronous signalsPaul Mackerras
This is a patch that I have had in my tree for ages. If init causes an exception that raises a signal, such as a SIGSEGV, SIGILL or SIGFPE, and it hasn't registered a handler for it, we don't deliver the signal, since init doesn't get any signals that it doesn't have a handler for. But that means that we just return to userland and generate the same exception again immediately. With this patch we print a message and kill init in this situation. This is very useful when you have a bug in the kernel that means that init doesn't get as far as executing its first instruction. :) Without this patch the system hangs when it gets to starting the userland init; with it you at least get a message giving you a clue about what has gone wrong. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09kbuild: m68k,parisc,ppc,ppc64,s390,xtensa use generic asm-offsets.h supportSam Ravnborg
Delete obsoleted parts form arch makefiles and rename to asm-offsets.h Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-09-05[PATCH] ppc32: defconfig for Marvell EV64360BP boardLee Nicks
Here is the default configuration for Marvell EV64360BP board. It has been tested on the board. Signed-off-by: Lee Nicks <allinux@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05[PATCH] ppc32: Don't sleep in flush_dcache_icache_page()Roland Dreier
flush_dcache_icache_page() will be called on an instruction page fault. We can't sleep in the fault handler, so use kmap_atomic() instead of just kmap() for the Book-E case. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27[PATCH] ppc32: 8xx avoid icbi misbehaviour in __flush_dcache_icache_physAnton Wöllert
On 8xx, in the case where a pagefault happens for a process who's not the owner of the vma in question (ptrace for instance), the flush operation is performed via the physical address. Unfortunately, that results in a strange, unexplainable "icbi" instruction fault, most likely due to a CPU bug (see oops below). Avoid that by flushing the page via its kernel virtual address. Oops: kernel access of bad area, sig: 11 [#2] NIP: C000543C LR: C000B060 SP: C0F35DF0 REGS: c0f35d40 TRAP: 0300 Not tainted MSR: 00009022 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 10 DAR: 00000010, DSISR: C2000000 TASK = c0ea8430[761] 'gdbserver' THREAD: c0f34000 Last syscall: 26 GPR00: 00009022 C0F35DF0 C0EA8430 00F59000 00000100 FFFFFFFF 00F58000 00000001 GPR08: C021DAEF C0270000 00009032 C0270000 22044024 10025428 01000800 00000001 GPR16: 007FFF3F 00000001 00000000 7FBC6AC0 00F61022 00000001 C0839300 C01E0000 GPR24: 00CD0889 C082F568 3000AC18 C02A7A00 C0EA15C8 00F588A9 C02ACB00 C02ACB00 NIP [c000543c] __flush_dcache_icache_phys+0x38/0x54 LR [c000b060] flush_dcache_icache_page+0x20/0x30 Call trace: [c000b154] update_mmu_cache+0x7c/0xa4 [c005ae98] do_wp_page+0x460/0x5ec [c005c8a0] handle_mm_fault+0x7cc/0x91c [c005ccec] get_user_pages+0x2fc/0x65c [c0027104] access_process_vm+0x9c/0x1d4 [c00076e0] sys_ptrace+0x240/0x4a4 [c0002bd0] ret_from_syscall+0x0/0x44 Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27[PATCH] 8xx: avoid "dcbst" misbehaviour with unpopulated TLBMarcelo Tosatti
The proposed _tlbie call at update_mmu_cache() is safe because: Addresses for which update_mmu_cache() gets invocated are never inside the static kernel virtual mapping, meaning that there is no risk for the _tlbie() here to be thrashing the pinned entry, as Dan suspected. The intermediate TLB state in which this bug can be triggered is not visible by userspace or any other contexts, except the page fault handling path. So there is no need to worry about userspace dcbxxx users. The other solution to this is to avoid dcbst misbehaviour in the first place, which involves changing in-kernel "dcbst" callers to use 8xx specific SPR's. Summary: On 8xx, cache control instructions (particularly "dcbst" from flush_dcache_icache) fault as write operation if there is an unpopulated TLB entry for the address in question. To workaround that, we invalidate the TLB here, thus avoiding dcbst misbehaviour. Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25[PATCH] ppc32: remove some unnecessary includes of bootmem.hKumar Gala
Continue the Good Fight: Limit bootmem.h include creep. Signed-off-by: Jon Loeliger <jdl@freescale.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25[PATCH] ppc32: Add support for Freescale e200 (Book-E) coreKumar Gala
The e200 core is a Book-E core (similar to e500) that has a unified L1 cache and is not cache coherent on the bus. The e200 core also adds a separate exception level for debug exceptions. Part of this patch helps to cleanup a few cases that are true for all Freescale Book-E parts, not just e500. Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] ppc32: Kill embedded system.map, use kallsymsBenjamin Herrenschmidt
This patch kills the whole embedded System.map mecanism and the bootloader-passed System.map that was used to provide symbol resolution in xmon. Instead, xmon now uses kallsyms like ppc64 does. No hurry getting that in Linus tree, let it be tested in -mm for a while first and make sure it doesn't break various embedded configs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>