aboutsummaryrefslogtreecommitdiff
path: root/include/linux/raid
AgeCommit message (Collapse)Author
2005-11-09[PATCH] md: complete conversion of md to use kthreadsNeilBrown
There are a few loose ends following the conversion of md to use kthreads: - Some fields in mdk_thread_t that aren't needed (kthreads does it's own completion and manages it's own name). - thread->run is now never NULL, so no need to check - Some tests for signal_pending that aren't needed (As we don't use signals to stop threads any more) - Some flush_signals are not needed - Some waits are interruptible and don't need to be. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: support BIO_RW_BARRIER for md/raid1NeilBrown
We can only accept BARRIER requests if all slaves handle barriers, and that can, of course, change with time.... So we keep track of whether the whole array seems safe for barriers, and also whether each individual rdev handles barriers. We initially assumes barriers are OK. When writing the superblock we try a barrier, and if that fails, we flag things for no-barriers. This will usually clear the flags fairly quickly. If writing the superblock finds that BIO_RW_BARRIER is -ENOTSUPP, we need to resubmit, so introduce function "md_super_wait" which waits for requests to finish, and retries ENOTSUPP requests without the barrier flag. When writing the real raid1, write requests which were BIO_RW_BARRIER but which aresn't supported need to be retried. So raid1d is enhanced to do this, and when any bio write completes (i.e. no retry needed) we remove it from the r1bio, so that devices needing retry are easy to find. We should hardly ever get -ENOTSUPP errors when writing data to the raid. It should only happen if: 1/ the device used to support BARRIER, but now doesn't. Few devices change like this, though raid1 can! or 2/ the array has no persistent superblock, so there was no opportunity to pre-test for barriers when writing the superblock. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: make md on-disk bitmaps not host-endianNeilBrown
Current bitmaps use set_bit et.al and so are host-endian, which means not-portable. Oops. Define a new version number (4) for which bitmaps are little-endian. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: convert 'faulty' and 'in_sync' fields to bits in 'flags' fieldNeilBrown
This has the advantage of removing the confusion caused by 'rdev_t' and 'mddev_t' both having 'in_sync' fields. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: improvements to raid5 handling of read errorsNeilBrown
Two refinements to the 'attempt-overwrite-on-read-error' mechanism. 1/ If the array is read-only, don't attempt an over-write. 2/ If there are more than max_nr_stripes read errors on a device with no success, fail the drive. This will make sure a dead drive will be eventually kicked even when we aren't trying to rewrite (which would normally kick a dead drive more quickly. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: change raid5 sysfs attribute to not create a new directoryNeilBrown
There isn't really a need for raid5 attributes to be an a subdirectory, so this patch moves them from /sys/block/mdX/md/raid5/attribute to /sys/block/mdX/md/attribute This suggests that all md personalities should co-operate about namespace usage, but that shouldn't be a problem. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: teach raid5 the difference between 'check' and 'repair'.NeilBrown
With this, raid5 can be asked to check parity without repairing it. It also keeps a count of the number of incorrect parity blocks found (mismatches) and reports them through sysfs. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: allow a manual resync with mdNeilBrown
You can trigger a 'check' with echo check > /sys/block/mdX/md/scan_mode or a check-and-repair errors with echo repair > /sys/block/mdX/md/scan_mode and read the current state from the same file. Note: personalities need to know the different between 'check' and 'repair', but don't yet. Until they do, 'check' will be the same as 'repair' and will just do a normal resync pass. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: add kobject/sysfs support to raid5NeilBrown
/sys/block/mdX/md/raid5/ contains raid5-related attributes. Currently stripe_cache_size is number of entries in stripe cache, and is settable. stripe_cache_active is number of active entries, and in only readable. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: extend md sysfs support to component devices.NeilBrown
Each device in an md array how has a corresponding /sys/block/mdX/md/devNN/ directory which can contain attributes. Currently there is only 'state' which summarises the state, nd 'super' which has a copy of the superblock, and 'block' which is a symlink to the block device. Also, /sys/block/mdX/md/rdNN represents slot 'NN' in the array, and is a symlink to the relevant 'devNN'. Obviously spare devices do not have a slot in the array, and so don't have such a symlink. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: initial sysfs support for mdNeilBrown
Start using kobjects in mddevs, and provide a couple of simple attributes (level and disks). Attributes live in /sys/block/mdX/md/attr-name Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09[PATCH] md: better handling of readerrors with raid5.NeilBrown
This patch changes the behaviour of raid5 when it gets a read error. Instead of just failing the device, it tried to find out what should have been there, and writes it over the bad block. For some media-errors, this has a reasonable chance of fixing the error. If the write succeeds, and a subsequent read succeeds as well, raid5 decided the address is OK and conitnues. Instead of failing a drive on read-error, we attempt to re-write the block, and then re-read. If that all works, we allow the device to remain in the array. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: add write-intent-bitmap support to raid5NeilBrown
Most awkward part of this is delaying write requests until bitmap updates have been flushed. To achieve this, we have a sequence number (seq_flush) which is incremented each time the raid5 is unplugged. If the raid thread notices that this has changed, it flushes bitmap changes, and assigned the value of seq_flush to seq_write. When a write request arrives, it is given the number from seq_write, and that write request may not complete until seq_flush is larger than the saved seq number. We have a new queue for storing stripes which are waiting for a bitmap flush and an extra flag for stripes to record if the write was 'degraded' and so should not clear the a bit in the bitmap. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: limit size of sb read/written to appropriate amountNeilBrown
version-1 superblocks are not (normally) 4K long, and can be of variable size. Writing the full 4K can cause corruption (but only in non-default configurations). With this patch the super-block-flavour can choose a size to read, and set a size to write based on what it finds. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: remove old cruft from md_k.h header fileNeilBrown
These inlines haven't been used for ages, they should go. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: allow md to load a superblock with feature-bit '1' setNeilBrown
As this is used to flag an internal bitmap. Also, introduce symbolic names for feature bits. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: support md/linear array with components greater than 2 terabytes.NeilBrown
linear currently uses division by the size of the smallest componenet device to find which device a request goes to. If that smallest device is larger than 2 terabytes, then the division will not work on some systems. So we introduce a pre-shift, and take care not to make the hash table too large, much like the code in raid0. Also get rid of conf->nr_zones, which is not needed. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: add write-behind support for md/raid1NeilBrown
If a device is flagged 'WriteMostly' and the array has a bitmap, and the bitmap superblock indicates that write_behind is allowed, then write_behind is enabled for WriteMostly devices. Write requests will be acknowledges as complete to the caller (via b_end_io) when all non-WriteMostly devices have completed the write, but will not be cleared from the bitmap until all devices complete. This requires memory allocation to make a local copy of the data being written. If there is insufficient memory, then we fall-back on normal write semantics. Signed-Off-By: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: support write-mostly device in raid1NeilBrown
This allows a device in a raid1 to be marked as "write mostly". Read requests will only be sent if there is no other option. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09[PATCH] md: all hot-add and hot-remove of md intent logging bitmapsNeilBrown
Both file-bitmaps and superblock bitmaps are supported. If you add a bitmap file on the array device, you lose. This introduces a 'default_bitmap_offset' field in mddev, as the ioctl used for adding a superblock bitmap doesn't have room for giving an offset. Later, this value will be setable via sysfs. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04[PATCH] md: make sure md bitmap updates are flushed when array is stopped.NeilBrown
The recent change to never ignore the bitmap, revealed that the bitmap isn't begin flushed properly when an array is stopped. We call bitmap_daemon_work three times as there is a three-stage pipeline for flushing updates to the bitmap file. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-15[PATCH] md/raid1: clear bitmap when fullsync completesNeilBrown
We need to be careful differentiating between a resync of a complete array, in which we can clear the bitmap, and a resync of a degraded array, in which we cannot. This patch cleans all that up. Cc: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] Two small fixes for md verion-1 superblocks.NeilBrown
1/ Must typecast int to (sector_t) before inverting or we might not invert enough bits. 2/ When "bitmap_offset" was added to mdp_superblock_1, we didn't increase the count of words-used (96 to 100). Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: allow md to update multiple superblocks in parallel.NeilBrown
currently, md updates all superblocks (one on each device) in series. It waits for one write to complete before starting the next. This isn't a big problem as superblock updates don't happen that often. However it is neater to do it in parallel, and if the drives in the array have gone to "sleep" after a period of idleness, then waking them is parallel is faster (and someone else should be worrying about power drain). Futher, we will need parallel superblock updates for a future patch which keeps the intent-logging bitmap near the superblock. Also remove the silly code that retired superblock updates 100 times. This simply never made sense. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: allow md intent bitmap to be stored near the superblock.NeilBrown
This provides an alternate to storing the bitmap in a separate file. The bitmap can be stored at a given offset from the superblock. Obviously the creator of the array must make sure this doesn't intersect with data.... After is good for version-0.90 superblocks. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: fix deadlock due to md thread processing delayed requests.NeilBrown
Before completing a 'write' the md superblock might need to be updated. This is best done by the md_thread. The current code schedules this up and queues the write request for later handling by the md_thread. However some personalities (Raid5/raid6) will deadlock if the md_thread tries to submit requests to its own array. So this patch changes things so the processes submitting the request waits for the superblock to be written and then submits the request itself. This fixes a recently-created deadlock in raid5/raid6 Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: optimise reconstruction when re-adding a recently failed drive.NeilBrown
When an array is degraded, bit in the intent-bitmap are never cleared. So if a recently failed drive is re-added, we only need to reconstruct the block that are still reflected in the bitmap. This patch adds support for this re-adding. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: raid1 support for bitmap intent loggingNeilBrown
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: enable the bitmap write-back daemon and wait for it.NeilBrown
Currently we don't wait for updates to the bitmap to be flushed to disk properly. The infrastructure all there, but it isn't being used.... A separate kernel thread (bitmap_writeback_daemon) is needed to wait for each page as we cannot get callbacks when a page write completes. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: optimised resync using Bitmap based intent loggingNeilBrown
With this patch, the intent to write to some block in the array can be logged to a bitmap file. Each bit represents some number of sectors and is set before any update happens, and only cleared when all writes relating to all sectors are complete. After an unclean shutdown, information in this bitmap can be used to optimise resync - only sectors which could be out-of-sync need to be updated. Also if a drive is removed and then added back into an array, the recovery can make use of the bitmap to optimise reconstruction. This is not implemented in this patch. Currently the bitmap is stored in a file which must (obviously) be stored on a separate device. The patch only provided infrastructure. It does not update any personalities to bitmap intent logging. Md arrays can still be used with no bitmap file. This patch has minimal impact on such arrays. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: improve the interface to sync_requestNeilBrown
1/ change the return value (which is number-of-sectors synced) from 'int' to 'sector_t'. The number of sectors is usually easily small enough to fit in an int, but if resync needs to abort, it may want to return the total number of remaining sectors, which could be large. Also errors cannot be returned as negative numbers now, so use 0 instead 2/ Add a 'skipped' return parameter to allow the array to report that it skipped the sectors. This allows md to take this into account in the speed calculations. Currently there is no important skipping, but the bitmap-based-resync that is coming will use this. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21[PATCH] md: improve locking on 'safemode' and move superblock writesNeilBrown
When md marks the superblock dirty before a write, it calls generic_make_request (to write the superblock) from within generic_make_request (to write the first dirty block), which could cause problems later. With this patch, the superblock write is always done by the helper thread, and write request are delayed until that write completes. Also, the locking around marking the array dirty and writing the superblock is improved to avoid possible races. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!