aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-02-06x86: trivial sparse/checkpatch in quirks.cHarvey Harrison
arch/x86/kernel/quirks.c:384:3: warning: returning void-valued expression arch/x86/kernel/quirks.c:387:3: warning: returning void-valued expression arch/x86/kernel/quirks.c:390:3: warning: returning void-valued expression arch/x86/kernel/quirks.c:393:3: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06x86 ptrace: disallow null cs/ssRoland McGrath
In my revamp of the x86 ptrace code for setting register values, I accidentally omitted a check that was there in the old code. Allowing %cs to be 0 causes a bad crash in recovery from iret failure. This patch fixes that regression against 2.6.24, and adds a comment that should help prevent this subtlety from being overlooked again. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06MAINTAINERS: RDC R-321x SoC maintainerFlorian Fainelli
Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06brk randomization: introduce CONFIG_COMPAT_BRKIngo Molnar
based on similar patch from: Pavel Machek <pavel@ucw.cz> Introduce CONFIG_COMPAT_BRK. If disabled then the kernel is free (but not obliged to) randomize the brk area. Heap randomization breaks ancient binaries, so we keep COMPAT_BRK enabled by default. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06brk: check the lower bound properlyJiri Kosina
There is a check in sys_brk(), that tries to make sure that we do not underflow the area that is dedicated to brk heap. The check is however wrong, as it assumes that brk area starts immediately after the end of the code (+bss), which is wrong for example in environments with randomized brk start. The proper way is to check whether the address is not below the start_brk address. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06x86: remove X2 workaroundIngo Molnar
With the spurious handler fix, the X2 does not lock up anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06x86: make spurious fault handler aware of large mappingsThomas Gleixner
In very rare cases, on certain CPUs, we could end up in the spurious fault handler and ignore a large pud/pmd mapping. The resulting pte pointer points into the mapped physical space and dereferencing it will fault recursively. Make the code aware of large mappings and do the permission check on the pmd/pud entry, when a large pud/pmd mapping is detected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06x86: make traps on entry code be debuggable in user space, 64-bitRoland McGrath
Unify the x86-64 behavior for 32-bit processes that set bogus %cs/%ss values (the only ones that can fault in iret) match what the native i386 behavior is. (do not kill the task via do_exit but generate a SIGSEGV signal) [ tglx@linutronix.de: build fix ] Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06Merge branch 'async-tx-for-linus' of ↵Linus Torvalds
git://lost.foo-projects.org/~dwillia2/git/iop into fix * 'async-tx-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: async_tx: allow architecture specific async_tx_find_channel implementations async_tx: replace 'int_en' with operation preparation flags async_tx: kill tx_set_src and tx_set_dest methods async_tx: kill ASYNC_TX_ASSUME_COHERENT iop-adma: use LIST_HEAD instead of LIST_HEAD_INIT async_tx: use LIST_HEAD instead of LIST_HEAD_INIT async_tx: fix compile breakage, mark do_async_xor __always_inline
2008-02-06scsi: megaraid: trivial drop duplicate mutex.h includeDaniel Walker
Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: SELinux: Remove security_get_policycaps() security: allow Kconfig to set default mmap_min_addr protection
2008-02-06Merge branch 'upstream-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: ata_piix.c:piix_init_one() must be __devinit sata_via.c: Remove missleading comment. libata-core: unblacklist HITACHI drives sata_nv: fix ATAPI issues with memory over 4GB (v7) ata: drivers/ata/sata_mv.c needs dmapool.h libata: kill now unused n_iter and fix sata_fsl ahci: fix CAP.NP and PI handling sata_mv: Support SoC controllers Rename: linux/pata_platform.h to linux/ata_platform.h
2008-02-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (35 commits) virtio net: fix oops on interface-up Fix PHY Lib support for gianfar and ucc_geth forcedeth: preserve registers forcedeth: phy status fix forcedeth: restart tx/rx ipvs: Make wrr "no available servers" error message rate-limited [PPPOL2TP]: Label unused warning when CONFIG_PROC_FS is not set. [NET_SCHED]: cls_flow: support classification based on VLAN tag [VLAN]: Constify skb argument to vlan_get_tag() [NET_SCHED]: cls_flow: fix key mask validity check [NET_SCHED]: em_meta: fix compile warning b43: Fix DMA for 30/32-bit DMA engines b43: fix build with CONFIG_SSB_PCIHOST=n mac80211: Is not EXPERIMENTAL anymore iwl3945-base.c: fix off-by-one errors b43legacy: fix DMA slot resource leakage b43legacy: drop packets we are not able to encrypt b43legacy: fix suspend/resume b43legacy: fix PIO crash Generic HDLC - use random_ether_addr() ...
2008-02-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Temporarily remove IOMMU merging code. [SPARC64]: Update defconfig. [SPARC]: Add new timerfd syscall entries.
2008-02-06fb: fix warning: no return statement in function returning non-voidAnton Vorontsov
Warning is reproducible with selected FB_CFB_REV_PIXELS_IN_BYTE. CC drivers/video/sysfillrect.o In file included from drivers/video/sysfillrect.c:18: drivers/video/fb_draw.h: In function `fb_rev_pixels_in_long': drivers/video/fb_draw.h:94: warning: no return statement in function returning non-void CC drivers/video/syscopyarea.o In file included from drivers/video/syscopyarea.c:22: drivers/video/fb_draw.h: In function `fb_rev_pixels_in_long': drivers/video/fb_draw.h:94: warning: no return statement in function returning non-void Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06virtio: add missing #include <linux/delay.h>Johann Felix Soden
Include linux/delay.h to fix compiler error: drivers/virtio/virtio_balloon.c: In function 'fill_balloon': drivers/virtio/virtio_balloon.c:98: error: implicit declaration of function 'msleep' Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext3: fix lock inversion in direct IOJan Kara
We cannot start transaction in ext3_direct_IO() and just let it last during the whole write because dio_get_page() acquires mmap_sem which ranks above transaction start (e.g. because we have dependency chain mmap_sem->PageLock->journal_start, or because we update atime while holding mmap_sem) and thus deadlocks could happen. We solve the problem by starting a transaction separately for each ext3_get_block() call. We *could* have a problem that we allocate a block and before its data are written out the machine crashes and thus we expose stale data. But that does not happen because for hole-filling generic code falls back to buffered writes and for file extension, we add inode to orphan list and thus in case of crash, journal replay will truncate inode back to the original size. [akpm@linux-foundation.org: build fix] Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Cc: Zach Brown <zach.brown@oracle.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06jbd.h: hide kernel only codeOlaf Hering
Move a few kernel-only things into __KERNEL__. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext3: remove unused code from ext3_find_entry()Mariusz Kozlowski
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext[234]: cleanup ext[234]_bg_num_gdb()Akinobu Mita
Use ext[234]_bg_has_super() to remove duplicate code. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext[234]: remove unused argument for ext[234]_find_goal()Akinobu Mita
The argument chain for ext[234]_find_goal() is not used. This patch removes it and fixes comment as well. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext[234]: use ext[234]_get_group_desc()Akinobu Mita
Use ext[234]_get_group_desc() to get group descriptor from group number. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext[234]: fix comment for nonexistent variableAkinobu Mita
The comment in ext[234]_new_blocks() describes about "i". But there is no local variable called "i" in that scope. I guess it has been renamed to group_no. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext3: change the default behaviour on errorAneesh Kumar K.V
ext3 file system was by default ignoring errors and continuing. This is not a good default as continuing on error could lead to file system corruption. Change the default to mark the file system readonly. Debian and ubuntu already does this as the default in their fstab. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: Eric Sandeen <sandeen@redhat.com> Cc: Jan Kara <jack@ucw.cz> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext3: return after ext3_error in case of failuresAneesh Kumar K.V
This fixes some instances where we were continuing after calling ext3_error. ext3_error calls panic only if errors=panic mount option is set. So we need to make sure we return correctly after ext3_error call Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06make jbd/journal.c:__journal_abort_hard() staticAdrian Bunk
__journal_abort_hard() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06BKL-removal: remove incorrect comment refering to lock_kernel() from jbd/jbd2Andi Kleen
None of the callers of this function does actually take the BKL as far as I can see. So remove the comment refering to the BKL. Signed-off-by: Andi Kleen <ak@suse.de> Cc: <linux-ext4@vger.kernel.org> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06BKL-removal: remove incorrect BKL comment in ext2Andi Kleen
No BKL used anywhere, so don't mention it. Signed-off-by: Andi Kleen <ak@suse.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06BKL-removal: convert ext2 over to use unlocked_ioctlAndi Kleen
I checked ext2_ioctl and could not find anything in there that would need the BKL. So convert it over to use unlocked_ioctl Signed-off-by: Andi Kleen <ak@suse.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext3: add block bitmap validationAneesh Kumar K.V
When a new block bitmap is read from disk in read_block_bitmap() there are a few bits that should ALWAYS be set. In particular, the blocks given corresponding to block bitmap, inode bitmap and inode tables. Validate the block bitmap against these blocks. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06ext2: add block bitmap validationAneesh Kumar K.V
When a new block bitmap is read from disk in read_block_bitmap() there are a few bits that should ALWAYS be set. In particular, the blocks given corresponding to block bitmap, inode bitmap and inode tables. Validate the block bitmap against these blocks. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06PNP: disable Supermicro H8DCE motherboard resources that overlap SATA BARsBjorn Helgaas
Some Supermicro BIOSes describe a SATA PCI BAR as a motherboard resource. The PNP system driver claims motherboard resources, and this prevents the sata_nv driver from requesting it later. This patch disables the PNP0C01/PNP0C02 resources so they won't be claimed by the PNP system driver, so they'll available for sata_nv. This fixes the bugs below, where sata_nv detects only two out of four SATA drives. The signature includes dmesg lines similar to these: pnp: 00:09: iomem range 0xdfefc000-0xdfefcfff has been reserved pnp: 00:09: iomem range 0xdfefd000-0xdfefd3ff has been reserved pnp: 00:09: iomem range 0xdfefe000-0xdfefe3ff has been reserved PCI: Unable to reserve mem region #6:1000@dfefd000 for device 0000:80:07.0 sata_nv: probe of 0000:80:07.0 failed with error -16 PCI: Unable to reserve mem region #6:1000@dfefe000 for device 0000:80:08.0 sata_nv: probe of 0000:80:08.0 failed with error -16 References: https://bugzilla.redhat.com/show_bug.cgi?id=280641 https://bugzilla.redhat.com/show_bug.cgi?id=313491 http://lkml.org/lkml/2008/1/9/449 http://thread.gmane.org/gmane.linux.acpi.devel/27312 This is post-2.6.24 material. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06PNP: do not test PNP_DRIVER_RES_DO_NOT_CHANGE on suspend/resumeRene Herman
The PNP_DRIVER_RES_DO_NOT_CHANGE flag is meant to signify that the PNP core should not change resources for the device -- not that it shouldn't disable/enable the device on suspend/resume. ALSA ISAPnP drivers set PNP_DRIVER_RES_DO_NOT_CHANAGE (0x0001) through setting PNP_DRIVER_RES_DISABLE (0x0003). The latter including the former may in itself be considered rather unexpected but doesn't change that suspend/resume wouldn't seem to have any business testing the flag. As reported by Ondrej Zary for snd-cs4236, ALSA driven ISAPnP cards don't survive swsusp hibernation with the resume skipping setting the resources due to testing the flag -- the same test in the suspend path isn't enough to keep hibernation from disabling the card it seems. These tests were added (in 2005) by Piere Ossman in commit 68094e3251a664ee1389fcf179497237cbf78331, "alsa: Improved PnP suspend support" who doesn't remember why. This deletes them. Signed-off-by: Rene Herman <rene.herman@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: Adam Belay <ambx1@neo.rr.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06isapnp driver semaphore to mutexDaniel Walker
Changed the isapnp semaphore to a mutex. [akpm@linux-foundation.org: no externs-in-c] [akpm@linux-foundation.org: build fix] Signed-off-by: Daniel Walker <dwalker@mvista.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06pnp: declare PNP option parsing functions as __initThomas Renninger
There are three kind of parse functions provided by PNP acpi/bios: - get current resources - set resources - get possible resources The first two may be needed later at runtime. The possible resource settings should never change dynamically. And even if this would make any sense (I doubt it), the current implementation only parses possible resource settings at early init time: -> declare all the option parsing __init [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-By: Rene Herman <rene.herman@gmail.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06simplify pnp_activate_dev() and pnp_disable_dev() return valuesBjorn Helgaas
Make pnp_activate_dev() and pnp_disable_dev() return only 0 (success) or a negative error value, as pci_enable_device() and pci_disable_device() do. Previously they returned: 0: device was already active (or disabled) 1: we just activated (or disabled) device <0: -EBUSY or error from pnp_start_dev() (or pnp_stop_dev()) Now we return only 0 (device is active or disabled) or <0 (error). All in-tree callers either ignore the return values or check only for errors (negative values). Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Adam Belay <ambx1@neo.rr.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: fix an occasional deadlock in raid5NeilBrown
raid5's 'make_request' function calls generic_make_request on underlying devices and if we run out of stripe heads, it could end up waiting for one of those requests to complete. This is bad as recursive calls to generic_make_request go on a queue and are not even attempted until make_request completes. So: don't make any generic_make_request calls in raid5 make_request until all waiting has been done. We do this by simply setting STRIPE_HANDLE instead of calling handle_stripe(). If we need more stripe_heads, raid5d will get called to process the pending stripe_heads which will call generic_make_request from a This change by itself causes a performance hit. So add a change so that raid5_activate_delayed is only called at unplug time, never in raid5. This seems to bring back the performance numbers. Calling it in raid5d was sometimes too soon... Neil said: How about we queue it for 2.6.25-rc1 and then about when -rc2 comes out, we queue it for 2.6.24.y? Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Tested-by: dean gaudet <dean@arctic.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: change ITERATE_RDEV_GENERIC to rdev_for_each_list, and remove ↵NeilBrown
ITERATE_RDEV_PENDING. Finish ITERATE_ to for_each conversion. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: change ITERATE_RDEV to rdev_for_eachNeilBrown
As this is more in line with common practice in the kernel. Also swap the args around to be more like list_for_each. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: change INTERATE_MDDEV to for_each_mddevNeilBrown
As this is more consistent with kernel style. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: change a few 'int' to 'size_t' in mdNeilBrown
As suggested by Andrew Morton. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: fix use-after-free bug when dropping an rdev from an md arrayNeilBrown
Due to possible deadlock issues we need to use a schedule work to kobject_del an 'rdev' object from a different thread. A recent change means that kobject_add no longer gets a refernce, and kobject_del doesn't put a reference. Consequently, we need to explicitly hold a reference to ensure that the last reference isn't dropped before the scheduled work get a chance to call kobject_del. Also, rename delayed_delete to md_delayed_delete to that it is more obvious in a stack trace which code is to blame. Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: allow an md array to appear with 0 drives if it has external metadataNeilBrown
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: lock address when changing attributes of component devicesNeilBrown
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: allow devices to be shared between md arraysNeilBrown
Currently, a given device is "claimed" by a particular array so that it cannot be used by other arrays. This is not ideal for DDF and other metadata schemes which have their own partitioning concept. So for externally managed metadata, just claim the device for md in general, require that "offset" and "size" are set properly for each device, and make sure that if a device is included in different arrays then the active sections do not overlap. This involves adding another flag to the rdev which makes it awkward to set "->flags = 0" to clear certain flags. So now clear flags explicitly by name when we want to clear things. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: set and test the ->persistent flag for md devices more consistentlyNeilBrown
If you try to start an array for which the number of raid disks is listed as zero, md will currently try to read metadata off any devices that have been given. This was done because the value of raid_disks is used to signal whether array details have been provided by userspace (raid_disks > 0) or must be read from the devices (raid_disks == 0). However for an array without persistent metadata (or with externally managed metadata) this is the wrong thing to do. So we add a test in do_md_run to give an error if raid_disks is zero for non-persistent arrays. This requires that mddev->persistent is set corrently at this point, which it currently isn't for in-kernel autodetected arrays. So set ->persistent for autodetect arrays, and remove the settign in super_*_validate which is now redundant. Also clear ->persistent when stopping an array so it is consistently zero when starting an array. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: allow a maximum extent to be set for resyncingNeilBrown
This allows userspace to control resync/reshape progress and synchronise it with other activities, such as shared access in a SAN, or backing up critical sections during a tricky reshape. Writing a number of sectors (which must be a multiple of the chunk size if such is meaningful) causes a resync to pause when it gets to that point. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: give userspace control over removing failed devices when external ↵NeilBrown
metdata in use When a device fails, we must not allow an further writes to the array until the device failure has been recorded in array metadata. When metadata is managed externally, this requires some synchronisation... Allow/require userspace to explicitly remove failed devices from active service in the array by writing 'none' to the 'slot' attribute. If this reduces the number of failed devices to 0, the write block will automatically be lowered. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: support 'external' metadata for md arraysNeilBrown
- Add a state flag 'external' to indicate that the metadata is managed externally (by user-space) so important changes need to be left of user-space to handle. Alternates are non-persistant ('none') where there is no stable metadata - after the array is stopped there is no record of it's status - and internal which can be version 0.90 or version 1.x These are selected by writing to the 'metadata' attribute. - move the updating of superblocks (sync_sbs) to after we have checked if there are any superblocks or not. - New array state 'write_pending'. This means that the metadata records the array as 'clean', but a write has been requested, so the metadata has to be updated to record a 'dirty' array before the write can continue. This change is reported to md by writing 'active' to the array_state attribute. - tidy up marking of sb_dirty: - don't set sb_dirty when resync finishes as md_check_recovery calls md_update_sb when the sync thread finishes anyway. - Don't set sb_dirty in multipath_run as the array might not be dirty. - don't mark superblock dirty when switching to 'clean' if there is no internal superblock (if external, userspace can choose to update the superblock whenever it chooses to). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06md: Update md bitmap during resync.NeilBrown
Currently an md array with a write-intent bitmap does not updated that bitmap to reflect successful partial resync. Rather the entire bitmap is updated when the resync completes. This is because there is no guarentee that resync requests will complete in order, and tracking each request individually is unnecessarily burdensome. However there is value in regularly updating the bitmap, so add code to periodically pause while all pending sync requests complete, then update the bitmap. Doing this only every few seconds (the same as the bitmap update time) does not notciably affect resync performance. [snitzer@gmail.com: export bitmap_cond_end_sync] Signed-off-by: Neil Brown <neilb@suse.de> Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>