aboutsummaryrefslogtreecommitdiff
path: root/include/asm-s390
AgeCommit message (Collapse)Author
2006-10-04[S390] Remove open-coded mem_map usage.Heiko Carstens
Use page_to_phys and pfn_to_page to avoid open-coded mem_map usage. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2006-10-04[S390] incorrect placement of include.Martin Schwidefsky
The include of linux/smp.h needs to be done before the #if that checks for the compiler version. Seems like fallout from the inline assembly cleanup patch vs. the directed yield patch. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-10-04[S390] Wire up sys_getcpu system call.Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2006-10-03fix file specification in commentsUwe Zeisberger
Many files include the filename at the beginning, serveral used a wrong one. Signed-off-by: Uwe Zeisberger <Uwe_Zeisberger@digi.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-02[PATCH] remove remaining errno and __KERNEL_SYSCALLS__ referencesArnd Bergmann
The last in-kernel user of errno is gone, so we should remove the definition and everything referring to it. This also removes the now-unused lib/execve.c file that was introduced earlier. Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] Add regs_return_value() helperAnanth N Mavinakayanahalli
Add the regs_return_value() macro to extract the return value in an architecture agnostic manner, given the pt_regs. Other architecture maintainers may want to add similar helpers. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Directed yield: direct yield of spinlocks for s390.Martin Schwidefsky
Use the new diagnose 0x9c in the spinlock implementation for s390. It yields the remaining timeslice of the virtual cpu that tries to acquire a lock to the virtual cpu that is the current holder of the lock. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] Directed yield: cpu_relax variants for spinlocks and rw-locksMartin Schwidefsky
On systems running with virtual cpus there is optimization potential in regard to spinlocks and rw-locks. If the virtual cpu that has taken a lock is known to a cpu that wants to acquire the same lock it is beneficial to yield the timeslice of the virtual cpu in favour of the cpu that has the lock (directed yield). With CONFIG_PREEMPT="n" this can be implemented by the architecture without common code changes. Powerpc already does this. With CONFIG_PREEMPT="y" the lock loops are coded with _raw_spin_trylock, _raw_read_trylock and _raw_write_trylock in kernel/spinlock.c. If the lock could not be taken cpu_relax is called. A directed yield is not possible because cpu_relax doesn't know anything about the lock. To be able to yield the lock in favour of the current lock holder variants of cpu_relax for spinlocks and rw-locks are needed. The new _raw_spin_relax, _raw_read_relax and _raw_write_relax primitives differ from cpu_relax insofar that they have an argument: a pointer to the lock structure. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] Make touch_nmi_watchdog imply touch_softlockup_watchdog on all archsMichal Schmidt
touch_nmi_watchdog() calls touch_softlockup_watchdog() on both architectures that implement it (i386 and x86_64). On other architectures it does nothing at all. touch_nmi_watchdog() should imply touch_softlockup_watchdog() on all architectures. Suggested by Andi Kleen. [heiko.carstens@de.ibm.com: s390 fix] Signed-off-by: Michal Schmidt <xschmi00@stud.feec.vutbr.cz> Cc: Andi Kleen <ak@muc.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Michal Schmidt <xschmi00@stud.feec.vutbr.cz> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29[PATCH] convert s390 page handling macros to functionsHeiko Carstens
Convert s390 page handling macros to functions. In particular this fixes a problem with s390's SetPageUptodate macro which uses its input parameter twice which again can cause subtle bugs. [akpm@osdl.org: build fix] Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-28[S390] Inline assembly cleanup.Martin Schwidefsky
Major cleanup of all s390 inline assemblies. They now have a common coding style. Quite a few have been shortened, mainly by using register asm variables. Use of the EX_TABLE macro helps as well. The atomic ops, bit ops and locking inlines new use the Q-constraint if a newer gcc is used. That results in slightly better code. Thanks to Christian Borntraeger for proof reading the changes. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-28[S390] __div64_32 for 31 bit.Martin Schwidefsky
The clocksource infrastructure introduced with commit ad596171ed635c51a9eef829187af100cbf8dcf7 broke 31 bit s390. The reason is that the do_div() primitive for 31 bit always had a restriction: it could only divide an unsigned 64 bit integer by an unsigned 31 bit integer. The clocksource code now uses do_div() with a base value that has the most significant bit set. The result is that clock->cycle_interval has a funny value which causes the linux time to jump around like mad. The solution is "obvious": implement a proper __div64_32 function for 31 bit s390. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-27[PATCH] consistently use MAX_ERRNO in __syscall_returnRandy Dunlap
Consistently use MAX_ERRNO when checking for errors in __syscall_return(). [ralf@linux-mips.org: build fix] Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] Standardize pxx_page macrosDave McCracken
One of the changes necessary for shared page tables is to standardize the pxx_page macros. pte_page and pmd_page have always returned the struct page associated with their entry, while pte_page_kernel and pmd_page_kernel have returned the kernel virtual address. pud_page and pgd_page, on the other hand, return the kernel virtual address. Shared page tables needs pud_page and pgd_page to return the actual page structures. There are very few actual users of these functions, so it is simple to standardize their usage. Since this is basic cleanup, I am submitting these changes as a standalone patch. Per Hugh Dickins' comments about it, I am also changing the pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning. Signed-off-by: Dave McCracken <dmccr@us.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] bootmem: use MAX_DMA_ADDRESS instead of LOW32LIMITHeiko Carstens
Introduce ARCH_LOW_ADDRESS_LIMIT which can be set per architecture to override the 4GB default limit used by the bootmem allocater within __alloc_bootmem_low() and __alloc_bootmem_low_node(). E.g. s390 needs a 2GB limit instead of 4GB. Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26[PATCH] trigger a syntax error if percpu macros are incorrectly usedJan Blunck
get_cpu_var()/per_cpu()/__get_cpu_var() arguments must be simple identifiers. Otherwise the arch dependent implementations might break. This patch enforces the correct usage of the macros by producing a syntax error if the variable is not a simple identifier. Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-24[S390] Unexport <asm/z90crypt.h>, export <asm/zcrypt.h> in its place.David Woodhouse
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-09-22Merge git://git.infradead.org/~dwmw2/hdronelineLinus Torvalds
* git://git.infradead.org/~dwmw2/hdroneline: [HEADERS] One line per header in Kbuild files to reduce conflicts Manual (trivial) conflict resolution in include/asm-s390/Kbuild
2006-09-20[S390] Use alternative user-copy operations for new hardware.Gerald Schaefer
This introduces new user-copy operations which are optimized for copying more than 256 Bytes on new hardware. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] Make user-copy operations run-time configurable.Gerald Schaefer
Introduces a struct uaccess_ops which allows setting user-copy operations at run-time. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] Cleanup in page table related code.Gerald Schaefer
Changed and simplified some page table related #defines and code. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] Linux API for writing z/VM APPLDATA Monitor records.Melissa Howland
This patch delivers a new Linux API in the form of a misc char device that is useable from user space and allows write access to the z/VM APPLDATA Monitor Records collected by the *MONITOR System Service of z/VM. Signed-off-by: Melissa Howland <melissah@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] cleanup appldata.Gerald Schaefer
Introduce asm header that contains the appldata data structures and the diag inline assembly. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] convert some assembler to C.Heiko Carstens
Convert GET_IPL_DEVICE assembler macro to C function. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] #undef in unistd.hMartin Schwidefsky
Avoid using #undef in unistd.h. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] empty function defines.Heiko Carstens
Use do { } while (0) constructs instead of empty defines to avoid subtle compile bugs. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] ipl/dump on panic.Michael Holzheu
It is now possible to specify a ccw/fcp dump device which is used to automatically create a system dump in case of a kernel panic. The dump device can be configured under /sys/firmware/dump. In addition it is now possible to specify a ccw/fcp device which is used for the next reboot of Linux. The reipl device can be configured under /sys/firmware/reipl. Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] initrd vs. bootmem bitmap.Heiko Carstens
Move initrd if the bitmap of the bootmem allocator would overwrite it. In addition this patch sets the default size and address of the initrd to 0. Therefore all boot loaders must set the initrd size and address correctly. This is especially relevant for ftp boot via HMC/SE, where this change requires a special patch file entry in the .ins file which sets these two values contained at address 0x10408 and 0x10410. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] add kprobes support.Michael Grundy
Signed-off-by: Michael Grundy <grundym@us.ibm.com> Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] zcrypt secure key cryptography extension.Ralph Wuerthner
Allow the user space to send extended cprb messages directly to the PCIXCC / CEX2C cards. This allows the CCA library to construct special crypto requests that use "secure" keys that are stored on the card. Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] zcrypt user space interface.Martin Schwidefsky
The user space interface of the zcrypt device driver implements the old user space interface as defined by the old z90crypt driver. Everything is there, the /dev/z90crypt misc character device, all the lovely ioctls and the /proc file. Even writing to the z90crypt proc file to configure the crypto device still works. It stands to reason to remove the proc write function someday since a much cleaner configuration via the sysfs is now available. The ap bus device drivers register crypto cards to the zcrypt user space interface. The request router of the user space interface picks one of the registered cards based on the predicted latency for the request and calls the driver via a callback found in the zcrypt_ops of the device. The request router only knows which operations the card can do and the minimum / maximum number of bits a request can have. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] remove old z90crypt driver.Martin Schwidefsky
The z90crypt driver has served its term. It is replaced by the shiny new zcrypt device driver. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-20[S390] EX_TABLE macro.Martin Schwidefsky
Add EX_TABLE helper macro to simplify creation of inline assembly exception table entries. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-09-19[HEADERS] One line per header in Kbuild files to reduce conflictsDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-09-16[PATCH] Fix 'make headers_check' on s390David Woodhouse
On Tue, 2006-09-12 at 17:44 +0100, David Woodhouse wrote: > asm-s390/debug.h requires linux/string.h, which does not exist > asm-s390/elf.h requires asm/system.h, which does not exist Move things around slightly so the right things end up within #ifdef __KERNEL__ and thus don't pollute the exported headers. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-17[S390] get_clock inline assembly.Andreas Krebbel
Add missing volatile to the get_clock / get_cycles inline assemblies to avoid that consecutive calls get optimized away. Signed-off-by: Andreas Krebbel <krebbel1@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-17[S390] Fix gcc warning about unused return values.Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-14[PATCH] remove set_wmb - arch removalSteven Rostedt
set_wmb should not be used in the kernel because it just confuses the code more and has no benefit. Since it is not currently used in the kernel this patch removes it so that new code does not include it. All archs define set_wmb(var, value) to do { var = value; wmb(); } while(0) except ia64 and sparc which use a mb() instead. But this is still moot since it is not used anyway. Hasn't been tested on any archs but x86 and x86_64 (and only compiled tested) Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-12[S390] Fix sparse warnings.Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-12[S390] path grouping and path verifications fixes.Cornelia Huck
1. Multipath devices for which SetPGID is not supported are not handled well. Use NOP ccws for path verification (sans path grouping) when SetPGID is not supported. 2. Check for PGIDs already set with SensePGID on _all_ paths (not just the first one) and try to find a common one. Moan if no common PGID can be found (and use NOP verification). If no PGIDs have been set, use the css global PGID (as before). (Rationale: SetPGID will get a command reject if the PGID it tries to set does not match the already set PGID.) 3. Immediately before reboot, issue RESET CHANNEL PATH (rcp) on all chpids. This will remove the old PGIDs. rcp will generate solicited CRWs which can be savely ignored by the machine check handler (all other actions create unsolicited CRWs). Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-12[S390] cpu_relax() is supposed to have barrier() semantics.Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-12[S390] fix futex_atomic_cmpxchg_inatomicMartin Schwidefsky
futex_atomic_cmpxchg_inatomic has the same bug as the other atomic futex operations: the operation needs to be done in the user address space, not the kernel address space. Add the missing sacf 256 & sacf 0. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-12[S390] raw_local_save_flags/raw_local_irq_restore type checkHeiko Carstens
Make sure that raw_local_save_flags and raw_local_irq_restore always get an unsigned long parameter. raw_irqs_disabled should call raw_local_save_flags instead of local_save_flags. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-12[S390] __builtin_trap() and gcc version.Heiko Carstens
__builtin_trap() has the archictecture defined backend in gcc since gcc 3.3. To make sure the kernel builds with gcc 3.2 as well, use the old style BUG() statement if compiled with older gcc versions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-04Merge git://git.infradead.org/hdrinstall-2.6Linus Torvalds
* git://git.infradead.org/hdrinstall-2.6: Remove export of include/linux/isdn/tpam.h Remove <linux/i2c-id.h> and <linux/i2c-algo-ite.h> from userspace export Restrict headers exported to userspace for SPARC and SPARC64 Add empty Kbuild files for 'make headers_install' in remaining arches. Add Kbuild file for Alpha 'make headers_install' Add Kbuild file for SPARC 'make headers_install' Add Kbuild file for IA64 'make headers_install' Add Kbuild file for S390 'make headers_install' Add Kbuild file for i386 'make headers_install' Add Kbuild file for x86_64 'make headers_install' Add Kbuild file for PowerPC 'make headers_install' Add generic Kbuild files for 'make headers_install' Basic implementation of 'make headers_check' Basic implementation of 'make headers_install'
2006-07-03[PATCH] lockdep: prove rwsem locking correctnessIngo Molnar
Use the lock validator framework to prove rwsem locking correctness. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: irqtrace subsystem, s390 supportHeiko Carstens
irqtrace support for s390. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: add per_cpu_offset()Ingo Molnar
Add the per_cpu_offset() generic method. (used by the lock validator) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02[PATCH] irq-flags: S390: Use the new IRQF_ constantsThomas Gleixner
Use the new IRQF_ constants and remove the SA_INTERRUPT define Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30[PATCH] zoned vm counters: create vmstat.c/.h from page_alloc.c/.hChristoph Lameter
NOTE: ZVC are *not* the lightweight event counters. ZVCs are reliable whereas event counters do not need to be. Zone based VM statistics are necessary to be able to determine what the state of memory in one zone is. In a NUMA system this can be helpful for local reclaim and other memory optimizations that may be able to shift VM load in order to get more balanced memory use. It is also useful to know how the computing load affects the memory allocations on various zones. This patchset allows the retrieval of that data from userspace. The patchset introduces a framework for counters that is a cross between the existing page_stats --which are simply global counters split per cpu-- and the approach of deferred incremental updates implemented for nr_pagecache. Small per cpu 8 bit counters are added to struct zone. If the counter exceeds certain thresholds then the counters are accumulated in an array of atomic_long in the zone and in a global array that sums up all zone values. The small 8 bit counters are next to the per cpu page pointers and so they will be in high in the cpu cache when pages are allocated and freed. Access to VM counter information for a zone and for the whole machine is then possible by simply indexing an array (Thanks to Nick Piggin for pointing out that approach). The access to the total number of pages of various types does no longer require the summing up of all per cpu counters. Benefits of this patchset right now: - Ability for UP and SMP configuration to determine how memory is balanced between the DMA, NORMAL and HIGHMEM zones. - loops over all processors are avoided in writeback and reclaim paths. We can avoid caching the writeback information because the needed information is directly accessible. - Special handling for nr_pagecache removed. - zone_reclaim_interval vanishes since VM stats can now determine when it is worth to do local reclaim. - Fast inline per node page state determination. - Accurate counters in /sys/devices/system/node/node*/meminfo. Current counters are counting simply which processor allocated a page somewhere and guestimate based on that. So the counters were not useful to show the actual distribution of page use on a specific zone. - The swap_prefetch patch requires per node statistics in order to figure out when processors of a node can prefetch. This patch provides some of the needed numbers. - Detailed VM counters available in more /proc and /sys status files. References to earlier discussions: V1 http://marc.theaimsgroup.com/?l=linux-kernel&m=113511649910826&w=2 V2 http://marc.theaimsgroup.com/?l=linux-kernel&m=114980851924230&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115014697910351&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767318740&w=2 Performance tests with AIM7 did not show any regressions. Seems to be a tad faster even. Tested on ia64/NUMA. Builds fine on i386, SMP / UP. Includes fixes for s390/arm/uml arch code. This patch: Move counter code from page_alloc.c/page-flags.h to vmstat.c/h. Create vmstat.c/vmstat.h by separating the counter code and the proc functions. Move the vm_stat_text array before zoneinfo_show. [akpm@osdl.org: s390 build fix] [akpm@osdl.org: HOTPLUG_CPU build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>