aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2006-01-08[PATCH] Swap Migration V5: migrate_pages() functionChristoph Lameter
This adds the basic page migration function with a minimal implementation that only allows the eviction of pages to swap space. Page eviction and migration may be useful to migrate pages, to suspend programs or for remapping single pages (useful for faulty pages or pages with soft ECC failures) The process is as follows: The function wanting to migrate pages must first build a list of pages to be migrated or evicted and take them off the lru lists via isolate_lru_page(). isolate_lru_page determines that a page is freeable based on the LRU bit set. Then the actual migration or swapout can happen by calling migrate_pages(). migrate_pages does its best to migrate or swapout the pages and does multiple passes over the list. Some pages may only be swappable if they are not dirty. migrate_pages may start writing out dirty pages in the initial passes over the pages. However, migrate_pages may not be able to migrate or evict all pages for a variety of reasons. The remaining pages may be returned to the LRU lists using putback_lru_pages(). Changelog V4->V5: - Use the lru caches to return pages to the LRU Changelog V3->V4: - Restructure code so that applying patches to support full migration does require minimal changes. Rename swapout_pages() to migrate_pages(). Changelog V2->V3: - Extract common code from shrink_list() and swapout_pages() Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: "Michael Kerrisk" <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] Swap Migration V5: PF_SWAPWRITE to allow writing to swapChristoph Lameter
Add PF_SWAPWRITE to control a processes permission to write to swap. - Use PF_SWAPWRITE in may_write_to_queue() instead of checking for kswapd and pdflush - Set PF_SWAPWRITE flag for kswapd and pdflush Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] Swap Migration V5: LRU operationsChristoph Lameter
This is the start of the `swap migration' patch series. Swap migration allows the moving of the physical location of pages between nodes in a numa system while the process is running. This means that the virtual addresses that the process sees do not change. However, the system rearranges the physical location of those pages. The main intent of page migration patches here is to reduce the latency of memory access by moving pages near to the processor where the process accessing that memory is running. The patchset allows a process to manually relocate the node on which its pages are located through the MF_MOVE and MF_MOVE_ALL options while setting a new memory policy. The pages of process can also be relocated from another process using the sys_migrate_pages() function call. Requires CAP_SYS_ADMIN. The migrate_pages function call takes two sets of nodes and moves pages of a process that are located on the from nodes to the destination nodes. Manual migration is very useful if for example the scheduler has relocated a process to a processor on a distant node. A batch scheduler or an administrator can detect the situation and move the pages of the process nearer to the new processor. sys_migrate_pages() could be used on non-numa machines as well, to force all of a particualr process's pages out to swap, if someone thinks that's useful. Larger installations usually partition the system using cpusets into sections of nodes. Paul has equipped cpusets with the ability to move pages when a task is moved to another cpuset. This allows automatic control over locality of a process. If a task is moved to a new cpuset then also all its pages are moved with it so that the performance of the process does not sink dramatically (as is the case today). Swap migration works by simply evicting the page. The pages must be faulted back in. The pages are then typically reallocated by the system near the node where the process is executing. For swap migration the destination of the move is controlled by the allocation policy. Cpusets set the allocation policy before calling sys_migrate_pages() in order to move the pages as intended. No allocation policy changes are performed for sys_migrate_pages(). This means that the pages may not faulted in to the specified nodes if no allocation policy was set by other means. The pages will just end up near the node where the fault occurred. There's another patch series in the pipeline which implements "direct migration". The direct migration patchset extends the migration functionality to avoid going through swap. The destination node of the relation is controllable during the actual moving of pages. The crutch of using the allocation policy to relocate is not necessary and the pages are moved directly to the target. Its also faster since swap is not used. And sys_migrate_pages() can then move pages directly to the specified node. Implement functions to isolate pages from the LRU and put them back later. This patch: An earlier implementation was provided by Hirokazu Takahashi <taka@valinux.co.jp> and IWAMOTO Toshihiro <iwamoto@valinux.co.jp> for the memory hotplug project. From: Magnus This breaks out isolate_lru_page() and putpack_lru_page(). Needed for swap migration. Signed-off-by: Magnus Damm <magnus.damm@gmail.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] add schedule_on_each_cpu()Christoph Lameter
swap migration's isolate_lru_page() currently uses an IPI to notify other processors that the lru caches need to be drained if the page cannot be found on the LRU. The IPI interrupt may interrupt a processor that is just processing lru requests and cause a race condition. This patch introduces a new function run_on_each_cpu() that uses the keventd() to run the LRU draining on each processor. Processors disable preemption when dealing the LRU caches (these are per processor) and thus executing LRU draining from another process is safe. Thanks to Lee Schermerhorn <lee.schermerhorn@hp.com> for finding this race condition. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] Make high and batch sizes of per_cpu_pagelists configurableRohit Seth
As recently there has been lot of traffic on the right values for batch and high water marks for per_cpu_pagelists. This patch makes these two variables configurable through /proc interface. A new tunable /proc/sys/vm/percpu_pagelist_fraction is added. This entry controls the fraction of pages at most in each zone that are allocated for each per cpu page list. The min value for this is 8. It means that we don't allow more than 1/8th of pages in each zone to be allocated in any single per_cpu_pagelist. The batch value of each per cpu pagelist is also updated as a result. It is set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8) Signed-off-by: Rohit Seth <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] drop-pagecacheAndrew Morton
Add /proc/sys/vm/drop_caches. When written to, this will cause the kernel to discard as much pagecache and/or reclaimable slab objects as it can. THis operation requires root permissions. It won't drop dirty data, so the user should run `sync' first. Caveats: a) Holds inode_lock for exorbitant amounts of time. b) Needs to be taught about NUMA nodes: propagate these all the way through so the discarding can be controlled on a per-node basis. This is a debugging feature: useful for getting consistent results between filesystem benchmarks. We could possibly put it under a config option, but it's less than 300 bytes. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] slab: remove unused align parameter from alloc_percpuPekka Enberg
__alloc_percpu and alloc_percpu both take an 'align' argument which is completely ignored. snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use it, but it will be ignored. Therefore, remove the 'align' argument and fixup the lone caller. Signed-off-by: Matthew Dobson <colpatch@us.ibm.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41.Olaf Hering
Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41. Also remove unneeded declations, add a public function. drivers/base/memory.c:53: error: static declaration of 'register_memory_notifier' follows non-static declaration include/linux/memory.h:85: error: previous declaration of 'register_memory_notifier' was here drivers/base/memory.c:58: error: static declaration of 'unregister_memory_notifier' follows non-static declaration include/linux/memory.h:86: error: previous declaration of 'unregister_memory_notifier' was here drivers/base/memory.c:68: error: static declaration of 'register_memory' follows non-static declaration include/linux/memory.h:73: error: previous declaration of 'register_memory' was here Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] asm-generic/atomic.h needs types.hAndrew Morton
For BITS_PER_LONG Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] powerpc: G4+ oprofile supportAndy Fleming
This patch adds oprofile support for the 7450 and all its multitudinous derivatives. * Added 7450 (and derivatives) support for oprofile * Changed e500 cputable to have oprofile model and cpu_type fields * Added support for classic 32-bit performance monitor interrupt * Cleaned up common powerpc oprofile code to be as common as possible * Cleaned up oprofile_impl.h to reflect 32 bit classic code * Added 32-bit MMCRx bitfield definitions and SPR numbers Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: pci_address_to_pio fixBenjamin Herrenschmidt
This fixes pci_address_to_pio() to return an unsigned long (to be safe) and fixes a bug in the implementation that caused it to return a bogus IO port number Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Replace VMALLOCBASE with VMALLOC_STARTDavid Gibson
On ppc64, we independently define VMALLOCBASE and VMALLOC_START to be the same thing: the start of the vmalloc() area at 0xd000000000000000. VMALLOC_START is used much more widely, including in generic code, so this patch gets rid of the extraneous VMALLOCBASE. This does require moving the definitions of region IDs from page_64.h to pgtable.h, but they don't clearly belong in the former rather than the latter, anyway. While we're moving them, clean up the definitions of the REGION_IDs: - Abolish REGION_SIZE, it was only used once, to define REGION_MASK anyway - Define the specific region ids in terms of the REGION_ID() macro. - Define KERNEL_REGION_ID in terms of PAGE_OFFSET rather than KERNELBASE. It amounts to the same thing, but conceptually this is about the region of the linear mapping (which starts at PAGE_OFFSET) rather than of the kernel text itself (which is at KERNELBASE). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Experimental support for new G5 Macs (#2)Benjamin Herrenschmidt
This adds some very basic support for the new machines, including the Quad G5 (tested), and other new dual core based machines and iMac G5 iSight (untested). This is still experimental ! There is no thermal control yet, there is no proper handing of MSIs, etc.. but it boots, I have all 4 cores up on my machine. Compared to the previous version of this patch, this one adds DART IOMMU support for the U4 chipset and thus should work fine on setups with more than 2Gb of RAM. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: export PCI fixup routinelinas
There is code in the RPAPHP directory that is identical to this routine; I'll be removing that code in an upcoming patch, but this patch is needed to expose the function to make it callable. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Update MPIC workaroundsSegher Boessenkool
Cleanup the MPIC IO-APIC workarounds, make them a bit more generic, smaller and faster. Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Remove device_node addrs/n_addrBenjamin Herrenschmidt
The pre-parsed addrs/n_addrs fields in struct device_node are finally gone. Remove the dodgy heuristics that did that parsing at boot and remove the fields themselves since we now have a good replacement with the new OF parsing code. This patch also fixes a bunch of drivers to use the new code instead, so that at least pmac32, pseries, iseries and g5 defconfigs build. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Dont set 32bit cputable bits on 64bitAnton Blanchard
Milton and I were looking at the cputable code and it looks like we can set spurious bits on 64bit. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] ppc64: Add NUMA cpu summary at bootAnton Blanchard
We used to print a NUMA cpu summary at boot before the hotplug cpu code was added. This has been useful for catching machine configuration as well as firmware bugs in the past. This patch restores that functionality. An example of the output is: Node 0 CPUs: 0-7 Node 1 CPUs: 8-15 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] ppc32: Add TQM85xx (8540/8541/8555/8560) board supportKumar Gala
This patch adds support for the TQ Components TQM85xx modules. Currently the modules TQM8540/8541/8555/8560 are supported. Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] spufs: Improved SPU preemptability [part 2].Arnd Bergmann
This patch reduces lock complexity of SPU scheduler, particularly for involuntary preemptive switches. As a result the new code does a better job of mapping the highest priority tasks to SPUs. Lock complexity is reduced by using the system default workqueue to perform involuntary saves. In this way we avoid nasty lock ordering problems that the previous code had. A "minimum timeslice" for SPU contexts is also introduced. The intent here is to avoid thrashing. While the new scheduler does a better job at prioritization it still does nothing for fairness. From: Mark Nutter <mnutter@us.ibm.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] spufs: Improved SPU preemptability.Arnd Bergmann
This patch makes it easier to preempt an SPU context by having the scheduler hold ctx->state_sema for much shorter periods of time. As part of this restructuring, the control logic for the "run" operation is moved from arch/ppc64/kernel/spu_base.c to fs/spufs/file.c. Of course the base retains "bottom half" handlers for class{0,1} irqs. The new run loop will re-acquire an SPU if preempted. From: Mark Nutter <mnutter@us.ibm.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Add arch-dependent copy_oldmem_pageMichael Ellerman
Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Add arch dependent basic infrastructure for Kdump.Michael Ellerman
Implementing the machine_crash_shutdown which will be called by crash_kexec (called in case of a panic, sysrq etc.). Disable the interrupts, shootdown cpus using debugger IPI and collect regs for all CPUs. elfcorehdr= specifies the location of elf core header stored by the crashed kernel. This command line option will be passed by the kexec-tools to capture kernel. savemaxmem= specifies the actual memory size that the first kernel has and this value will be used for dumping in the capture kernel. This command line option will be passed by the kexec-tools to capture kernel. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Fixups for kernel linked at 32 MBMichael Ellerman
There's a few places where we need to fix things up for the kernel to work if it's linked at 32MB: - platforms/powermac/smp.c To start secondary cpus on pmac we patch the reset vector, which is fine. Except if we're above 32MB we don't have enough bits for an absolute branch, it needs to relative. - kernel/head_64.s - A few branches in the cpu hold code need to load the full target address and do a bctr. - after_prom_start needs to load PHYSICAL_START as the dest address, not 0. - The exception prolog needs to load the low word of the target adddress, not just the low halfword. - Fixup handling of the initial stab address. - kernel/setup_64.c smp_release_cpus() needs to write 1 to the spinloop flag near 0, not 32 MB. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Reroute interrupts from 0 + offset to PHYSICAL_START + offsetMichael Ellerman
Regardless of where the kernel's linked we always get interrupts at low addresses. This patch creates a trampoline in the first 3 pages of memory, where interrupts land, and patches those addresses to jump into the real kernel code at PHYSICAL_START. We also need to reserve the trampoline code and a bit more in prom.c Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Create a trampoline for the fwnmi vectorsMichael Ellerman
The fwnmi vectors can be anywhere < 32 MB, so we need to use a trampoline for them. The kdump kernel will register the trampoline addresses, which will then jump up to the real code above 32 MB. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Add CONFIG_CRASH_DUMPMichael Ellerman
This patch adds a Kconfig variable, CONFIG_CRASH_DUMP, which configures the built kernel for use as a Kdump kernel. Currently "all" this involves is changing the value of KERNELBASE to 32 MB. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: numa placement for dynamically added memoryMike Kravetz
This places dynamically added memory within the appropriate numa node. A new routine hot_add_scn_to_nid() replicates most of the memory scanning code in parse_numa_properties(). Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Separate usage of KERNELBASE and PAGE_OFFSETMichael Ellerman
This patch separates usage of KERNELBASE and PAGE_OFFSET. I haven't looked at any of the PPC32 code, if we ever want to support Kdump on PPC we'll have to do another audit, ditto for iSeries. This patch makes PAGE_OFFSET the constant, it'll always be 0xC * 1 gazillion for 64-bit. To get a physical address from a virtual one you subtract PAGE_OFFSET, _not_ KERNELBASE. KERNELBASE is the virtual address of the start of the kernel, it's often the same as PAGE_OFFSET, but _might not be_. If you want to know something's offset from the start of the kernel you should subtract KERNELBASE. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Add a is_kernel_addr() macroMichael Ellerman
There's a bunch of code that compares an address with KERNELBASE to see if it's a "kernel address", ie. >= KERNELBASE. The proper test is actually to compare with PAGE_OFFSET, since we're going to change KERNELBASE soon. So replace all of them with an is_kernel_addr() macro that does that. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Propagate regs through to machine_crash_shutdownMichael Ellerman
Currently machine_crash_shutdown() gets a struct pt_regs, but doesn't pass it through to the ppc_md function, it should. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Update OF address parsersBenjamin Herrenschmidt
This updates the OF address parsers to return the IO flags indicating the type of address obtained. It also adds a PCI call for converting physical addresses that hit IO space into into IO tokens, and add routines that return the translated addresses into struct resource Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: udbg updatesBenjamin Herrenschmidt
The udbg low level io layer has an issue with udbg_getc() returning a char (unsigned on ppc) instead of an int, thus the -1 if you had no available input device could end up turned into 0xff, filling your display with bogus characters. This fixes it, along with adding a little blob to xmon to do a delay before exiting when getting an EOF and fixing the detection of ADB keyboards in udbg_adb.c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: migrate common PCI hotplug codeLinas Vepstas
23-rpaphp-migrate.patch (parts) This patch moves some pci device add & remove code from the PCI hotplug directory to the arch/powerpc/kernel directory, and cleans it up a tad. The primary reason for this is that the code performs some fairly generic operations that are shared with the PCI error recovery code (living in the arch/powerpc/kernel directory). Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: make pcibios_claim_one_bus available to other codeLinas Vepstas
22-rpaphp-eliminate-dupe-code.patch (parts) The RPAPHP code contains two routines that appear to be gratuitous copies of very similar pci code. In particular, rpaphp_claim_resource ~~ pci_claim_resource rpadlpar_claim_one_bus == pcibios_claim_one_bus This makes pcibios_claim_one_bus from arch/powerpc/kernel/pci_64.c available to the RPAPHP code. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: PCI hotplug common code eliminationLinas Vepstas
20-rpaphp-eeh-cleanup.patch This patch move some code from the rpaphp directory, to the powerpc directory, where it should have been all along (Among other things, I need it in the powerpc directory for the PCI error recovery.) Please note that patch affects TWO maintainers: Paul, after applying the powerpc part, please ask that GregKH appli the PCI part. It is safe to have the powerpc part go in first. It would be bad to have the PCI part go in first. Signed-off-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Remove some unneeded fields from the pacaDavid Gibson
This patch removes several unnecessary fields from the paca: - next_jiffy_update_tb was simply unused. Remove trivially. - The exdsi exception save area was not used. There were plans to use it, but they never seem to have gone anywhere. If they ever do, we can put it back. Remove from the paca, and from asm-offsets.c - The default_decr field was used from asm, but was only ever assigned the value of tb_ticks_per_jiffy. Just access tb_ticks_per_jiffy from asm directly instead. Built and booted on POWER5 LPAR and iSeries RS64. Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Remove ItLpRegSave area from the pacaDavid Gibson
On iSeries, the paca contains, amongst other things an ItLpRegSave structure used by the hypervisor to save registers. The hypervisor locates this area through a pointer at the beginning of the paca, so the structure itself can be located elsewhere. This patch moves the reg_save area out into its own array. This reduces the amount of iSeries specific gunk which is visible to general powerpc code via paca.h Built and booted on POWER5 LPAR and iSeries RS64. Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Add back support for booting from BootX (#2)Benjamin Herrenschmidt
ARCH=powerpc couldn't boot from BootX as it uses a "different" way of getting in the kernel. This patch adds the necessary trampolines, creating a flattened device-tree from the tree passed from MacOS, and initializing the btext engine early for really-early debugging. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Unify udbg (#2)Benjamin Herrenschmidt
This patch unifies udbg for both ppc32 and ppc64 when building the merged achitecture. xmon now has a single "back end". The powermac udbg stuff gets enriched with some ADB capabilities and btext output. In addition, the early_init callback is now called on ppc32 as well, approx. in the same order as ppc64 regarding device-tree manipulations. The init sequences of ppc32 and ppc64 are getting closer, I'll unify them in a later patch. For now, you can force udbg to the scc using "sccdbg" or to btext using "btextdbg" on powermacs. I'll implement a cleaner way of forcing udbg output to something else than the autodetected OF output device in a later patch. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: serial port discovery (#2)Benjamin Herrenschmidt
This moves the discovery of legacy serial ports to a separate file, makes it common to ppc32 and ppc64, and reworks it to use the new OF address translators to get to the ports early. This new version can also detect some PCI serial cards using legacy chips and will probably match those discovered port with the default console choice. Only ppc64 gets udbg still yet, unifying udbg isn't finished yet. It also adds some speed-probing code to udbg so that the default console can come up at the same speed it was set to by the firmware. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: Add OF address parsing code (#2)Benjamin Herrenschmidt
Parsing addresses extracted from Open Firmware isn't a simple matter. We have various bits of code that try to do it in various place, including some heuristics in prom.c that pre-parse addresses at boot and fill device-nodes "addrs", but those are dodgy at best and I want to deprecate them. So this patch introduces a new set of routines that should be capable of parsing most types of addresses and translating them into CPU physical addresses. It currently works for things on PCI busses and ISA busses and should work on "standard" busses like the root bus or the MacIO bus that don't put funky flags in addresses. If you have other bus types that do use funky flags, you'll have to add new bus type translators, which is fairly easy. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09ppc: remove duplicate bseip.hPaul Mackerras
include/asm-ppc/bseip.h is a duplicate of arch/ppc/platforms/bseip.h and is not referenced anywhere, so get rid of it. Pointed out by Marcelo Tosatti. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09powerpc: Update __NR_syscalls to account for SPU syscallsPaul Mackerras
A previous patch ended up not increasing __NR_syscalls to account for the new SPU syscalls (probably my fault). Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] spufs: cooperative scheduler supportArnd Bergmann
This adds a scheduler for SPUs to make it possible to use more logical SPUs than physical ones are present in the system. Currently, there is no support for preempting a running SPU thread, they have to leave the SPU by either triggering an event on the SPU that causes it to return to the owning thread or by sending a signal to it. This patch also adds operations that enable accessing an SPU in either runnable or saved state. We use an RW semaphore to protect the state of the SPU from changing underneath us, while we are holding it readable. In order to change the state, it is acquired writeable and a context save or restore is executed before downgrading the semaphore to read-only. From: Mark Nutter <mnutter@us.ibm.com>, Uli Weigand <Ulrich.Weigand@de.ibm.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] kernel-side context switch code for spufsMark Nutter
This adds the code needed to perform a context switch from spufs, following the recommended 76-step sequence. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] spufs: switchable spu contextsMark Nutter
Add some infrastructure for saving and restoring the context of an SPE. This patch creates a new structure that can hold the whole state of a physical SPE in memory. It also contains code that avoids races during the context switch and the binary code that is loaded to the SPU in order to access its registers. The actual PPE- and SPE-side context switch code are two separate patches. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] spufs: The SPU file system, baseArnd Bergmann
This is the current version of the spu file system, used for driving SPEs on the Cell Broadband Engine. This release is almost identical to the version for the 2.6.14 kernel posted earlier, which is available as part of the Cell BE Linux distribution from http://www.bsc.es/projects/deepcomputing/linuxoncell/. The first patch provides all the interfaces for running spu application, but does not have any support for debugging SPU tasks or for scheduling. Both these functionalities are added in the subsequent patches. See Documentation/filesystems/spufs.txt on how to use spufs. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] powerpc: IBMEBUS bus supportHeiko J Schick
This patch adds the necessary core bus support used by device drivers that sit on the IBM GX bus on modern pSeries machines like the Galaxy infiniband for example. It provide transparent DMA ops (the low level driver works with virtual addresses directly) along with a simple bus layer using the Open Firmware matching routines. Signed-off-by: Heiko J Schick <schickhj@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09[PATCH] syscall entry/exit revampDavid Woodhouse
This cleanup patch speeds up the null syscall path on ppc64 by about 3%, and brings the ppc32 and ppc64 code slightly closer together. The ppc64 code was checking current_thread_info()->flags twice in the syscall exit path; once for TIF_SYSCALL_T_OR_A before disabling interrupts, and then again for TIF_SIGPENDING|TIF_NEED_RESCHED etc after disabling interrupts. Now we do the same as ppc32 -- check the flags only once in the fast path, and re-enable interrupts if necessary in the ptrace case. The patch abolishes the 'syscall_noerror' member of struct thread_info and replaces it with a TIF_NOERROR bit in the flags, which is handled in the slow path. This shortens the syscall entry code, which no longer needs to clear syscall_noerror. The patch adds a TIF_SAVE_NVGPRS flag which causes the syscall exit slow path to save the non-volatile GPRs into a signal frame. This removes the need for the assembly wrappers around sys_sigsuspend(), sys_rt_sigsuspend(), et al which existed solely to save those registers in advance. It also means I don't have to add new wrappers for ppoll() and pselect(), which is what I was supposed to be doing when I got distracted into this... Finally, it unifies the ppc64 and ppc32 methods of handling syscall exit directly into a signal handler (as required by sigsuspend et al) by introducing a TIF_RESTOREALL flag which causes _all_ the registers to be reloaded from the pt_regs by taking the ret_from_exception path, instead of the normal syscall exit path which stomps on the callee-saved GPRs. It appears to pass an LTP test run on ppc64, and passes basic testing on ppc32 too. Brief tests of ptrace functionality with strace and gdb also appear OK. I wouldn't send it to Linus for 2.6.15 just yet though :) Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>