aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2006-03-23[PATCH] swsusp: add check for suspension of X-controlled devicesRafael J. Wysocki
It is unsafe to suspend devices if the hardware is controlled by X. Add an extra check to prevent this from happening. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] kernel/power: move externs to header filesRandy Dunlap
Move externs from C source files to header files. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] swsusp: low level interfaceRafael J. Wysocki
Introduce the low level interface that can be used for handling the snapshot of the system memory by the in-kernel swap-writing/reading code of swsusp and the userland interface code (to be introduced shortly). Also change the way in which swsusp records the allocated swap pages and, consequently, simplifies the in-kernel swap-writing/reading code (this is necessary for the userland interface too). To this end, it introduces two helper functions in mm/swapfile.c, so that the swsusp code does not refer directly to the swap internals. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] x86: Make _syscallX() macros compile in PIC modeMarkus Gutschke
Gcc reserves %ebx when compiling position-independent-code on i386. This means, the _syscallX() macros in include/asm-i386/unistd.h will not compile. This patch is changes the existing macros to take special care to preserve %ebx. The bug can be tracked at http://bugzilla.kernel.org/show_bug.cgi?id=6204 Signed-off-by: Markus Gutschke <markus@google.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] i386 spinlocks: disable interrupts only if we enabled themChuck Ebbert
_raw_spin_lock_flags() is entered with interrupts disabled. If it cannot obtain a spinlock, it checks the flags that were passed and re-enables interrupts before spinning if that's how the flags are set. When the spinlock might be available, it disables interrupts (even if they are already disabled) before trying to get the lock. Change that so interrupts are only disabled if they have been enabled. This costs nine bytes of duplicated spinloop code. Fastpath before patch: jle <keep looping> not-taken conditional jump cli disable interrupts jmp <try for lock> unconditional jump Fastpath after patch, if interrupts were not enabled: jg <try for lock> taken conditional branch Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] Fix the imlicit declaration of mtrr_centaur_report_mcr in ↵Jesper Juhl
arch/i386/kernel/cpu/centaur.c arch/i386/kernel/cpu/centaur.c: In function `centaur_mcr_insert': arch/i386/kernel/cpu/centaur.c:33: warning: implicit declaration of function `mtrr_centaur_report_mcr' Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] i386: fix uses of user_mode() vs. user_mode_vm()Jan Beulich
>commit 76381fee7e8feb4c22be636aa5d4765dbe4fbf9e >Author: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> >Date: Thu Jun 23 00:08:46 2005 -0700 > > [PATCH] xen: x86_64: use more usermode macro > > Make use of the user_mode macro where it's possible. This is useful for Xen > because it will need only to redefine only the macro to a hypervisor call. I am of the opinion that the above changeset is incomplete, i.e. it missed converting some previous uses of user_mode to user_mode_vm. While most of them could be considered just cosmetical, at least the one in die_nmi doesn't appear to be. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> Cc: Zachary Amsden <zach@vmware.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] i386: actively synchronize vmalloc area when registering certain ↵Jan Beulich
callbacks Registering a callback handler through register_die_notifier() is obviously primarily intended for use by modules. However, the way these currently get called it is basically impossible for them to actually be used by modules, as there is, on non-PAE configurationes, a good chance (the larger the module, the better) for the system to crash as a result. This is because the callback gets invoked (a) in the page fault path before the top level page table propagation gets carried out (hence a fault to propagate the top level page table entry/entries mapping to module's code/data would nest infinitly) and (b) in the NMI path, where nested faults must absolutely not happen, since otherwise the IRET from the nested fault re-enables NMIs, potentially resulting in nested NMI occurences. Besides the modular aspect, similar problems would even arise for in- kernel consumers of the API if they touched ioremap()ed or vmalloc()ed memory inside their handlers. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] x86: early printk handling fixesStas Sergeev
The history is that -mm kernels do not work for me for a few months already. The things started from crashing somewhere after starting init, and for the last month - no boot at all, just "Uncompressing... OK, booting kernel", and silence. Early console didn't work too. With the latest releases this degraded into an infinite stream of the "Unknown interrupt or fault" messages. So today my patience ran out and I started to think how can I collect at least some info for the bug-report. Attached is the patch that allows to gather some valueable debug info on the problem by making an early console more useable. I can't properly test the patch, as the kernel still doesn't boot, so I'll explain it in details in a hope someone else can justify the intrusive changes. arch_hooks.h: added prototypes for setup_early_printk() and early_printk(). setup.c: killed wrong setup_early_printk() prototype. Moved setup_early_printk() a bit earlier, as it was not "early enough" to cover the bug I was fighting with. early_printk.c: made it to start printing from the bottom of the screen, otherwise the messages interfere with the ones of the boot-loader, so you can't read them. Signed-off-by: Stas Sergeev <stsp@aknet.ru> Cc: Andi Kleen <ak@muc.de> Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] i386: remove duplicate declaration of mp_bus_id_to_pci_busChris Wright
mp_bus_id_to_pci_bus is declared identically twice. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] Compilation fix for ES7000 when no ACPI is specified in config (i386)Natalie.Protasevich@unisys.com
ES7000 platform code clean up for compilation errors and a warning. Ifdef'd the ACPI related parts in the ES7000 platform code. They were causing compile errors in certain configuration (without ACPI defined). I think this approach would be best (as opposed to Kconfig changes) since it only touches the subarch... Signed-off-by: <Natalie.Protasevich@unisys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] i386: Add a temporary to make put_user more type safeEric W. Biederman
In some code I am developing I had occasion to change the type of a variable. This made the value put_user was putting to user space wrong. But the code continued to build cleanly without errors. Introducing a temporary fixes this problem and at least with gcc-3.3.5 does not cause gcc any problems with optimizing out the temporary. gcc-4.x using SSA internally ought to be even better at optimizing out temporaries, so I don't expect a temporary to become a problem. Especially because in all correct cases the types on both sides of the assignment to the temporary are the same. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] x86: SMP alternativesGerd Hoffmann
Implement SMP alternatives, i.e. switching at runtime between different code versions for UP and SMP. The code can patch both SMP->UP and UP->SMP. The UP->SMP case is useful for CPU hotplug. With CONFIG_CPU_HOTPLUG enabled the code switches to UP at boot time and when the number of CPUs goes down to 1, and switches to SMP when the number of CPUs goes up to 2. Without CONFIG_CPU_HOTPLUG or on non-SMP-capable systems the code is patched once at boot time (if needed) and the tables are released afterwards. The changes in detail: * The current alternatives bits are moved to a separate file, the SMP alternatives code is added there. * The patch adds some new elf sections to the kernel: .smp_altinstructions like .altinstructions, also contains a list of alt_instr structs. .smp_altinstr_replacement like .altinstr_replacement, but also has some space to save original instruction before replaving it. .smp_locks list of pointers to lock prefixes which can be nop'ed out on UP. The first two are used to replace more complex instruction sequences such as spinlocks and semaphores. It would be possible to deal with the lock prefixes with that as well, but by handling them as special case the table sizes become much smaller. * The sections are page-aligned and padded up to page size, so they can be free if they are not needed. * Splitted the code to release init pages to a separate function and use it to release the elf sections if they are unused. Signed-off-by: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] DM: Fix bug: BIO_RW_BARRIER requests to md/raid1 hang.NeilBrown
Both R1BIO_Barrier and R1BIO_Returned are 4 !!!! This means that barrier requests don't get returned (i.e. b_endio called) because it looks like they already have been. Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (78 commits) [PATCH] powerpc: Add FSL SEC node to documentation [PATCH] macintosh: tidy-up driver_register() return values [PATCH] powerpc: tidy-up of_register_driver()/driver_register() return values [PATCH] powerpc: via-pmu warning fix [PATCH] macintosh: cleanup the use of i2c headers [PATCH] powerpc: dont allow old RTC to be selected [PATCH] powerpc: make powerbook_sleep_grackle static [PATCH] powerpc: Fix warning in add_memory [PATCH] powerpc: update mailing list addresses [PATCH] powerpc: Remove calculation of io hole [PATCH] powerpc: iseries: Add bootargs to /chosen [PATCH] powerpc: iseries: Add /system-id, /model and /compatible [PATCH] powerpc: Add strne2a() to convert a string from EBCDIC to ASCII [PATCH] powerpc: iseries: Make more stuff static in platforms/iseries/mf.c [PATCH] powerpc: iseries: Remove pointless iSeries_(restart|power_off|halt) [PATCH] powerpc: iseries: mf related cleanups [PATCH] powerpc: Replace platform_is_lpar() with a firmware feature [PATCH] powerpc: trivial: Cleanup whitespace in cputable.h [PATCH] powerpc: Remove unused iommu_off logic from pSeries_init_early() [PATCH] powerpc: Unconfuse htab_bolt_mapping() callers ...
2006-03-22Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [TCP]: Do not use inet->id of global tcp_socket when sending RST. [NETFILTER]: Fix undefined references to get_h225_addr [NETFILTER]: futher {ip,ip6,arp}_tables unification [NETFILTER]: Fix xt_policy address matching [NETFILTER]: nf_conntrack: support for layer 3 protocol load on demand [NETFILTER]: x_tables: set the protocol family in x_tables targets/matches [NETFILTER]: conntrack: cleanup the conntrack ID initialization [NETFILTER]: nfnetlink_queue: fix nfnetlink message size [NETFILTER]: ctnetlink: Fix expectaction mask dumping [NETFILTER]: Fix Kconfig typos [NETFILTER]: Fix ip6tables breakage from {get,set}sockopt compat layer
2006-03-22Merge master.kernel.org:/home/rmk/linux-2.6-serialLinus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-serial: [SERIAL] Merge avlab serial board entries in parport_serial [SERIAL] kernel console should send CRLF not LFCR
2006-03-22Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm: (45 commits) [ARM] 3389/1: typo and grammar fix [ARM] 3386/1: AT91RM9200 Clock update [ARM] 3384/1: AT91RM9200: Timer [ARM] 3382/1: ixp2000: unify defconfigs [ARM] 3381/1: ixp2000: fix slowport write timing control register fields [ARM] 3380/1: ixp2000: simplify ixdp2x00_master_npu() check [ARM] 3379/1: ixp2000: use generic 8250 debug macros [ARM] 3378/1: ixp2000: fix gpio interrupt handling [ARM] Quieten spurious IRQ detection [ARM] Use kcalloc to allocate counter_config array rather than kmalloc [ARM] Oprofile: dynamically allocate counter_config [ARM] Oprofile: Convert semaphore to mutex [ARM] 3376/2: S3C2410 - update defconfig [ARM] 3375/1: S3C2440 - fix osiris machine build [ARM] 3374/1: ep93xx: gpio interrupt support [ARM] 3361/1: S3C24XX - add USB bus clock source [ARM] 3360/1: S3C2440 - add set rate methods and camera clock [ARM] 3359/1: S3C24XX - add support for clk_set_rate [ARM] Convert kmalloc+memset to kzalloc [ARM] 3373/1: move uengine loader to arch/arm/common ...
2006-03-22Merge branch 'master'Jeff Garzik
2006-03-22[NETFILTER]: futher {ip,ip6,arp}_tables unificationDmitry Mishin
This patch moves {ip,ip6,arp}t_entry_{match,target} definitions to x_tables.h. This move simplifies code and future compatibility fixes. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22[NETFILTER]: nf_conntrack: support for layer 3 protocol load on demandPablo Neira Ayuso
x_tables matches and targets that require nf_conntrack_ipv[4|6] to work don't have enough information to load on demand these modules. This patch introduces the following changes to solve this issue: o nf_ct_l3proto_try_module_get: try to load the layer 3 connection tracker module and increases the refcount. o nf_ct_l3proto_module put: drop the refcount of the module. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22[NETFILTER]: x_tables: set the protocol family in x_tables targets/matchesPablo Neira Ayuso
Set the family field in xt_[matches|targets] registered. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22[ARM] 3381/1: ixp2000: fix slowport write timing control register fieldsLennert Buytenhek
Patch from Lennert Buytenhek The original version of the chip docs had the PW and SU fields in the slowport write timing control register accidentally reversed. This is mentioned in the errata (documentation change #4) and fixed in newer docs. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-22[ARM] 3380/1: ixp2000: simplify ixdp2x00_master_npu() checkLennert Buytenhek
Patch from Lennert Buytenhek On the IXDP2x00s, the NPU that is PCI master is always the egress (i.e. 'master') NPU. At least on the IXDP2800, both NPUs have flash, so the ixp2000_has_flash() check in ixdp2x00_master_npu() is useless. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-22[ARM] 3379/1: ixp2000: use generic 8250 debug macrosLennert Buytenhek
Patch from Lennert Buytenhek The xscale UART in the ixp2000 is basically just an 8250 UART (with some extra bits and pieces), so we can use the generic 8250 debug macros on the ixp2000. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/perex/alsaLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/perex/alsa: (124 commits) [ALSA] version 1.0.11rc4 [PATCH] Intruduce DMA_28BIT_MASK [ALSA] hda-codec - Add support for ASUS P4GPL-X [ALSA] hda-codec - Add support for HP nx9420 laptop [ALSA] Fix memory leaks in error path of control.c [ALSA] AMD Au1x00: AC'97 controller is memory mapped [ALSA] AMD Au1x00: fix DMA init/cleanup [ALSA] hda-codec - Fix generic auto-configurator [ALSA] hda-codec - Fix BIOS auto-configuration [ALSA] Fixes typos in Audiophile-USB.txt [ALSA] ice1712 - typo fixes for dxr_enable module option [ALSA] AMD Au1x00: make driver build after cleanup [ALSA] ice1712 - Fix wrong value types for enum items [ALSA] fix resource leak in usbmixer [ALSA] Fix gus_pcm dereference before NULL [ALSA] Fix seq_clientmgr dereferences before NULL check [ALSA] hda-codec - Fix for Samsung R65 and ASUS A6J [ALSA] hda-codec - Add support for VAIO FE550G and SZ110 [ALSA] usb-audio: add Maya44 mixer control names [ALSA] usb-audio: add Casio PL-40R support ...
2006-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: fixed path to moved file in include/linux/device.h Fix spelling in E1000_DISABLE_PACKET_SPLIT Kconfig description Documentation/dvb/get_dvb_firmware: fix firmware URL Documentation: Update to BUG-HUNTING Remove superfluous NOTIFY_COOKIE_LEN define add "tags" to .gitignore Fix "frist", "fisrt", typos fix rwlock usage example It's UTF-8
2006-03-22Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Add a secondary TSB for hugepage mappings. [SPARC]: Respect vm_page_prot in io_remap_page_range().
2006-03-22Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [TG3]: Bump driver version and reldate. [TG3]: Skip phy power down on some devices [TG3]: Fix SRAM access during tg3_init_one() [X25]: dte facilities 32 64 ioctl conversion [X25]: allow ITU-T DTE facilities for x25 [X25]: fix kernel error message 64 bit kernel [X25]: ioctl conversion 32 bit user to 64 bit kernel [NET]: socket timestamp 32 bit handler for 64 bit kernel [NET]: allow 32 bit socket ioctl in 64 bit kernel [BLUETOOTH]: Return negative error constant
2006-03-22Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (138 commits) [SCSI] libata: implement minimal transport template for ->eh_timed_out [SCSI] eliminate rphy allocation in favour of expander/end device allocation [SCSI] convert mptsas over to end_device/expander allocations [SCSI] allow displaying and setting of cache type via sysfs [SCSI] add scsi_mode_select to scsi_lib.c [SCSI] 3ware 9000 add big endian support [SCSI] qla2xxx: update MAINTAINERS [SCSI] scsi: move target_destroy call [SCSI] fusion - bump version [SCSI] fusion - expander hotplug suport in mptsas module [SCSI] fusion - exposing raid components in mptsas [SCSI] fusion - memory leak, and initializing fields [SCSI] fusion - exclosure misspelled [SCSI] fusion - cleanup mptsas event handling functions [SCSI] fusion - removing target_id/bus_id from the VirtDevice structure [SCSI] fusion - static fix's [SCSI] fusion - move some debug firmware event debug msgs to verbose level [SCSI] fusion - loginfo header update [SCSI] add scsi_reprobe_device [SCSI] megaraid_sas: fix extended timeout handling ...
2006-03-22[PATCH] page migration reorgChristoph Lameter
Centralize the page migration functions in anticipation of additional tinkering. Creates a new file mm/migrate.c 1. Extract buffer_migrate_page() from fs/buffer.c 2. Extract central migration code from vmscan.c 3. Extract some components from mempolicy.c 4. Export pageout() and remove_from_swap() from vmscan.c 5. Make it possible to configure NUMA systems without page migration and non-NUMA systems with page migration. I had to so some #ifdeffing in mempolicy.c that may need a cleanup. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] hugepage: is_aligned_hugepage_range() cleanupDavid Gibson
Quite a long time back, prepare_hugepage_range() replaced is_aligned_hugepage_range() as the callback from mm/mmap.c to arch code to verify if an address range is suitable for a hugepage mapping. is_aligned_hugepage_range() stuck around, but only to implement prepare_hugepage_range() on archs which didn't implement their own. Most archs (everything except ia64 and powerpc) used the same implementation of is_aligned_hugepage_range(). On powerpc, which implements its own prepare_hugepage_range(), the custom version was never used. In addition, "is_aligned_hugepage_range()" was a bad name, because it suggests it returns true iff the given range is a good hugepage range, whereas in fact it returns 0-or-error (so the sense is reversed). This patch cleans up by abolishing is_aligned_hugepage_range(). Instead prepare_hugepage_range() is defined directly. Most archs use the default version, which simply checks the given region is aligned to the size of a hugepage. ia64 and powerpc define custom versions. The ia64 one simply checks that the range is in the correct address space region in addition to being suitably aligned. The powerpc version (just as previously) checks for suitable addresses, and if necessary performs low-level MMU frobbing to set up new areas for use by hugepages. No libhugetlbfs testsuite regressions on ppc64 (POWER5 LPAR). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] hugepage: Move hugetlb_free_pgd_range() prototype to hugetlb.hDavid Gibson
The optional hugepage callback, hugetlb_free_pgd_range() is presently implemented non-trivially only on ia64 (but I plan to add one for powerpc shortly). It has its own prototype for the function in asm-ia64/pgtable.h. However, since the function is called from generic code, it make sense for its prototype to be in the generic hugetlb.h header file, as the protypes other arch callbacks already are (prepare_hugepage_range(), set_huge_pte_at(), etc.). This patch makes it so. Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] hugepage: Fix hugepage logic in free_pgtables()David Gibson
free_pgtables() has special logic to call hugetlb_free_pgd_range() instead of the normal free_pgd_range() on hugepage VMAs. However, the test it uses to do so is incorrect: it calls is_hugepage_only_range on a hugepage sized range at the start of the vma. is_hugepage_only_range() will return true if the given range has any intersection with a hugepage address region, and in this case the given region need not be hugepage aligned. So, for example, this test can return true if called on, say, a 4k VMA immediately preceding a (nicely aligned) hugepage VMA. At present we get away with this because the powerpc version of hugetlb_free_pgd_range() is just a call to free_pgd_range(). On ia64 (the only other arch with a non-trivial is_hugepage_only_range()) we get away with it for a different reason; the hugepage area is not contiguous with the rest of the user address space, and VMAs are not permitted in between, so the test can't return a false positive there. Nonetheless this should be fixed. We do that in the patch below by replacing the is_hugepage_only_range() test with an explicit test of the VMA using is_vm_hugetlb_page(). This in turn changes behaviour for platforms where is_hugepage_only_range() returns false always (everything except powerpc and ia64). We address this by ensuring that hugetlb_free_pgd_range() is defined to be identical to free_pgd_range() (instead of a no-op) on everything except ia64. Even so, it will prevent some otherwise possible coalescing of calls down to free_pgd_range(). Since this only happens for hugepage VMAs, removing this small optimization seems unlikely to cause any trouble. This patch causes no regressions on the libhugetlbfs testsuite - ppc64 POWER5 (8-way), ppc64 G5 (2-way) and i386 Pentium M (UP). Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] hugepage: Make {alloc,free}_huge_page() localDavid Gibson
Originally, mm/hugetlb.c just handled the hugepage physical allocation path and its {alloc,free}_huge_page() functions were used from the arch specific hugepage code. These days those functions are only used with mm/hugetlb.c itself. Therefore, this patch makes them static and removes their prototypes from hugetlb.h. This requires a small rearrangement of code in mm/hugetlb.c to avoid a forward declaration. This patch causes no regressions on the libhugetlbfs testsuite (ppc64, POWER5). Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] hugepage: Strict page reservation for hugepage inodesDavid Gibson
These days, hugepages are demand-allocated at first fault time. There's a somewhat dubious (and racy) heuristic when making a new mmap() to check if there are enough available hugepages to fully satisfy that mapping. A particularly obvious case where the heuristic breaks down is where a process maps its hugepages not as a single chunk, but as a bunch of individually mmap()ed (or shmat()ed) blocks without touching and instantiating the pages in between allocations. In this case the size of each block is compared against the total number of available hugepages. It's thus easy for the process to become overcommitted, because each block mapping will succeed, although the total number of hugepages required by all blocks exceeds the number available. In particular, this defeats such a program which will detect a mapping failure and adjust its hugepage usage downward accordingly. The patch below addresses this problem, by strictly reserving a number of physical hugepages for hugepage inodes which have been mapped, but not instatiated. MAP_SHARED mappings are thus "safe" - they will fail on mmap(), not later with an OOM SIGKILL. MAP_PRIVATE mappings can still trigger an OOM. (Actually SHARED mappings can technically still OOM, but only if the sysadmin explicitly reduces the hugepage pool between mapping and instantiation) This patch appears to address the problem at hand - it allows DB2 to start correctly, for instance, which previously suffered the failure described above. This patch causes no regressions on the libhugetblfs testsuite, and makes a test (designed to catch this problem) pass which previously failed (ppc64, POWER5). Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] Enable mprotect on huge pagesZhang, Yanmin
2.6.16-rc3 uses hugetlb on-demand paging, but it doesn_t support hugetlb mprotect. From: David Gibson <david@gibson.dropbear.id.au> Remove a test from the mprotect() path which checks that the mprotect()ed range on a hugepage VMA is hugepage aligned (yes, really, the sense of is_aligned_hugepage_range() is the opposite of what you'd guess :-/). In fact, we don't need this test. If the given addresses match the beginning/end of a hugepage VMA they must already be suitably aligned. If they don't, then mprotect_fixup() will attempt to split the VMA. The very first test in split_vma() will check for a badly aligned address on a hugepage VMA and return -EINVAL if necessary. From: "Chen, Kenneth W" <kenneth.w.chen@intel.com> On i386 and x86-64, pte flag _PAGE_PSE collides with _PAGE_PROTNONE. The identify of hugetlb pte is lost when changing page protection via mprotect. A page fault occurs later will trigger a bug check in huge_pte_alloc(). The fix is to always make new pte a hugetlb pte and also to clean up legacy code where _PAGE_PRESENT is forced on in the pre-faulting day. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: optimise page_countNick Piggin
Optimise page_count compound page test and make it consistent with similar functions. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] remove set_page_count() outside mm/Nick Piggin
set_page_count usage outside mm/ is limited to setting the refcount to 1. Remove set_page_count from outside mm/, and replace those users with init_page_count() and set_page_refcounted(). This allows more debug checking, and tighter control on how code is allowed to play around with page->_count. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: nommu use compound pagesNick Piggin
Now that compound page handling is properly fixed in the VM, move nommu over to using compound pages rather than rolling their own refcounting. nommu vm page refcounting is broken anyway, but there is no need to have divergent code in the core VM now, nor when it gets fixed. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: David Howells <dhowells@redhat.com> (Needs testing, please). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: make __put_page internalNick Piggin
Remove __put_page from outside the core mm/. It is dangerous because it does not handle compound pages nicely, and misses 1->0 transitions. If a user later appears that really needs the extra speed we can reevaluate. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] vmscan: use unsigned longsAndrew Morton
Turn basically everything in vmscan.c into `unsigned long'. This is to avoid the possibility that some piece of code in there might decide to operate upon more than 4G (or even 2G) of pages in one hit. This might be silly, but we'll need it one day. Cc: Christoph Lameter <clameter@sgi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] on_each_cpu(): disable local interruptsAndrew Morton
When on_each_cpu() runs the callback on other CPUs, it runs with local interrupts disabled. So we should run the function with local interrupts disabled on this CPU, too. And do the same for UP, so the callback is run in the same environment on both UP and SMP. (strictly it should do preempt_disable() too, but I think local_irq_disable is sufficiently equivalent). Also uninlines on_each_cpu(). softirq.c was the most appropriate file I could find, but it doesn't seem to justify creating a new file. Oh, and fix up that comment over (under?) x86's smp_call_function(). It drives me nuts. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] slab: Remove SLAB_NO_REAP optionChristoph Lameter
SLAB_NO_REAP is documented as an option that will cause this slab not to be reaped under memory pressure. However, that is not what happens. The only thing that SLAB_NO_REAP controls at the moment is the reclaim of the unused slab elements that were allocated in batch in cache_reap(). Cache_reap() is run every few seconds independently of memory pressure. Could we remove the whole thing? Its only used by three slabs anyways and I cannot find a reason for having this option. There is an additional problem with SLAB_NO_REAP. If set then the recovery of objects from alien caches is switched off. Objects not freed on the same node where they were initially allocated will only be reused if a certain amount of objects accumulates from one alien node (not very likely) or if the cache is explicitly shrunk. (Strangely __cache_shrink does not check for SLAB_NO_REAP) Getting rid of SLAB_NO_REAP fixes the problems with alien cache freeing. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] kcalloc(): INT_MAX -> ULONG_MAXAdrian Bunk
Since size_t has the same size as a long on all architectures, it's enough for overflow checks to check against ULONG_MAX. This change could allow a compiler better optimization (especially in the n=1 case). The practical effect seems to be positive, but quite small: text data bss dec hex filename 21762380 5859870 1848928 29471178 1c1b1ca vmlinux-old 21762211 5859870 1848928 29471009 1c1b121 vmlinux-patched Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: page_state comment moreNick Piggin
Clarify that preemption needs to be guarded against with the __xxx_page_state functions. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: split highorder pagesNick Piggin
Have an explicit mm call to split higher order pages into individual pages. Should help to avoid bugs and be more explicit about the code's intention. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: de-skew page refcountingNick Piggin
atomic_add_unless (atomic_inc_not_zero) no longer requires an offset refcount to function correctly. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: simplify vmscan vs release refcountingNick Piggin
The VM has an interesting race where a page refcount can drop to zero, but it is still on the LRU lists for a short time. This was solved by testing a 0->1 refcount transition when picking up pages from the LRU, and dropping the refcount in that case. Instead, use atomic_add_unless to ensure we never pick up a 0 refcount page from the LRU, thus a 0 refcount page will never have its refcount elevated until it is allocated again. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22[PATCH] mm: slab less atomicsNick Piggin
Atomic operation removal from slab Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>