aboutsummaryrefslogtreecommitdiff
path: root/fs/gfs2/lops.c
AgeCommit message (Collapse)Author
2007-08-14[GFS2] soft lockup detected in databuf_lo_before_commitBob Peterson
This is part 2 of the patch for bug #245832, part 1 of which is already in the git tree. The problem was that sdp->sd_log_num_databuf was not always being protected by the gfs2_log_lock spinlock, but the sd_log_le_databuf (which it is supposed to reflect) was protected. That meant there was a timing window during which gfs2_log_flush called databuf_lo_before_commit and the count didn't match what was really on the linked list in that window. So when it ran out of items on the linked list, it decremented total_dbuf from 0 to -1 and thus never left the "while(total_dbuf)" loop. The solution is to protect the variable sdp->sd_log_num_databuf so that the value will always match the contents of the linked list, and therefore the number will never go negative, and therefore, the loop will be exited properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Addendum to the journaled file/unmount patchRobert Peterson
This patch is an addendum to the previous journaled file/unmount patch. It fixes a problem discovered during testing. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] assertion failure after writing to journaled file, umountRobert Peterson
This patch passes all my nasty tests that were causing the code to fail under one circumstance or another. Here is a complete summary of all changes from today's git tree, in order of appearance: 1. There are now separate variables for metadata buffer accounting. 2. Variable sd_log_num_hdrs is no longer needed, since the header accounting is taken care of by the reserve/refund sequence. 3. Fixed a tiny grammatical problem in a comment. 4. Added a new function "calc_reserved" to calculate the reserved log space. This isn't entirely necessary, but it has two benefits: First, it simplifies the gfs2_log_refund function greatly. Second, it allows for easier debugging because I could sprinkle the code with calls to this function to make sure the accounting is proper (by adding asserts and printks) at strategic point of the code. 5. In log_pull_tail there apparently was a kludge to fix up the accounting based on a "pull" parameter. The buffer accounting is now done properly, so the kludge was removed. 6. File sync operations were making a call to gfs2_log_flush that writes another journal header. Since that header was unplanned for (reserved) by the reserve/refund sequence, the free space had to be decremented so that when log_pull_tail gets called, the free space is be adjusted properly. (Did I hear you call that a kludge? well, maybe, but a lot more justifiable than the one I removed). 7. In the gfs2_log_shutdown code, it optionally syncs the log by specifying the PULL parameter to log_write_header. I'm not sure this is necessary anymore. It just seems to me there could be cases where shutdown is called while there are outstanding log buffers. 8. In the (data)buf_lo_before_commit functions, I changed some offset values from being calculated on the fly to being constants. That simplified some code and we might as well let the compiler do the calculation once rather than redoing those cycles at run time. 9. This version has my rewritten databuf_lo_add function. This version is much more like its predecessor, buf_lo_add, which makes it easier to understand. Again, this might not be necessary, but it seems as if this one works as well as the previous one, maybe even better, so I decided to leave it in. 10. In databuf_lo_before_commit, a previous data corruption problem was caused by going off the end of the buffer. The proper solution is to have the proper limit in place, rather than stopping earlier. (Thus my previous attempt to fix it is wrong). If you don't wrap the buffer, you're stopping too early and that causes more log buffer accounting problems. 11. In lops.h there are two new (previously mentioned) constants for figuring out the data offset for the journal buffers. 12. There are also two new functions, buf_limit and databuf_limit to calculate how many entries will fit in the buffer. 13. In function gfs2_meta_wipe, it needs to distinguish between pinned metadata buffers and journaled data buffers for proper journal buffer accounting. It can't use the JDATA gfs2_inode flag because it's sometimes passed the "real" inode and sometimes the "metadata inode" and the inode flags will be random bits in a metadata gfs2_inode. It needs to base its decision on which was passed in. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Journaled file write/unstuff bugRobert Peterson
This patch is for bugzilla bug 283162, which uncovered a number of bugs pertaining to writing to files that have the journaled bit on. These bugs happen most often when writing to the meta_fs because the files are always journaled. So operations like gfs2_grow were particularly vulnerable, although many of the problems could be recreated with normal files after setting the journaled bit on. The problems fixed are: -GFS2 wasn't ever writing unstuffed journaled data blocks to their in-place location on disk. Now it does. -If you unmounted too quickly after doing IO to a journaled file, GFS2 was crashing because you would discard a buffer whose bufdata was still on the active items list. GFS2 now deals with this gracefully. -GFS2 was losing track of the bufdata for journaled data blocks, and it wasn't getting freed, causing an error when you tried to unmount the module. GFS2 now frees all the bufdata structures. -There was a memory corruption occurring because GFS2 wrote twice as many log entries for journaled buffers. -It was occasionally trying to write journal headers in buffers that weren't currently mapped. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] fix jdata issuesBenjamin Marzinski
This is a patch for the first three issues of RHBZ #238162 The first issue is that when you allocate a new page for a file, it will not start off uptodate. This makes sense, since you haven't written anything to that part of the file yet. Unfortunately, gfs2_pin() checks to make sure that the buffers are uptodate. The solution to this is to mark the buffers uptodate in gfs2_commit_write(), after they have been zeroed out and have the data written into them. I'm pretty confident with this fix, although it's not completely obvious that there is no problem with marking the buffers uptodate here. The second issue is simply that you can try to pin a data buffer that is already on the incore log, and thus, already pinned. This patch checks to see if this buffer is already on the log, and exits databuf_lo_add() if it is, just like buf_lo_add() does. The third issue is that gfs2_log_flush() doesn't do it's block accounting correctly. Both metadata and journaled data are logged, but gfs2_log_flush() only compares the number of metadata blocks with the number of blocks to commit to the ondisk journal. This patch also counts the journaled data blocks. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01[GFS2] Fix log entry list corruptionBenjamin Marzinski
When glock_lo_add and rg_lo_add attempt to add an element to the log, they check to see if has already been added before locking the log. If another process adds that element to the log in this window between the check and locking the log, the element will be added to the list twice. This causes the log element list to become corrupted in such a way that the log element can never be successfully removed from the list. This patch pulls the list_empty() check inside the log lock, to remove this window. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Fix list corruption in lops.cSteven Whitehouse
The patch below appears to fix the list corruption that we are seeing on occasion. Although the transaction structure is private to a single thread, when the queued structures are dismantled during an in-core commit, its possible for a different thread to be trying to add the same structure to another, new, transaction at the same time. To avoid this, this patch takes the log spinlock during this operation. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Fix race in logging codeRussell Cattelan
The log lock is dropped prior to io submittion, but this exposes a hole in which the log data structures may be going away due to a truncate. Store the buffer head in a local pointer prior to dropping the lock and relay on the buffer_head lock for consitency on the buffer head. Signed-Off-By: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] split and annotate gfs2_log_headAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-12[GFS2] Pass the correct value to kunmap_atomicRussell Cattelan
Pass kaddr rather than (incorrect) struct page to kunmap_atomic. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-25[GFS2/DLM] Fix trailing whitespaceSteven Whitehouse
As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-21[GFS2] Tidy up meta_io codeSteven Whitehouse
Fix a bug in the directory reading code, where we might have dereferenced a NULL pointer in case of OOM. Updated the directory code to use the new & improved version of gfs2_meta_ra() which now returns the first block that was being read. Previously it was releasing it requiring following code to grab the block again at each point it was called. Also turned off readahead on directory lookups since we are reading a hash table, and therefore reading the entries in order is very unlikely. Readahead is still used for all other calls to the directory reading function (e.g. when growing the hash table). Removed the DIO_START constant. Everywhere this was used, it was used to unconditionally start i/o aside from a couple of places, so I've removed it and made the couple of exceptions to this rule into separate functions. Also hunted through the other DIO flags and removed them as arguments from functions which were always called with the same combination of arguments. Updated gfs2_meta_indirect_buffer to be a bit more efficient and hopefully also be a bit easier to read. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-19[GFS2] Export lm_interface to kernel headersFabio Massimo Di Nitto
lm_interface.h has a few out of the tree clients such as GFS1 and userland tools. Right now, these clients keeps a copy of the file in their build tree that can go out of sync. Move lm_interface.h to include/linux, export it to userland and clean up fs/gfs2 to use the new location. Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05[GFS2] Style changes in logging codeSteven Whitehouse
As per Jan Engelhardt's comments, removed some unused code and removed some brackets which were not required. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-05[GFS2] Fix end of multi-line structuresSteven Whitehouse
As per Jan Engelhardt's request, I've added a ',' to the end of each of the multi-line structures which didn't already have one (most already did). Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04[GFS2] More style changesSteven Whitehouse
As per Jan Engelhardt's fourth email, this is the first part of the change set with a few minor style points. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04[GFS2] Change all types to uX styleSteven Whitehouse
This makes all fixed size types have consistent names. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-01[GFS2] Update copyright, tidy up incore.hSteven Whitehouse
As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this updates the copyright message to say "version" in full rather than "v.2". Also incore.h has been updated to remove forward structure declarations which are not required. The gfs2_quota_lvb structure has now had endianess annotations added to it. Also quota.c has been updated so that we now store the lvb data locally in endian independant format to avoid needing a structure in host endianess too. As a result the endianess conversions are done as required at various points and thus the conversion routines in lvb.[ch] are no longer required. I've moved the one remaining constant in lvb.h thats used into lm.h and removed the unused lvb.[ch]. I have not changed the HIF_ constants. That is left to a later patch which I hope will unify the gh_flags and gh_iflags fields of the struct gfs2_holder. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-31[GFS2] Fix releasepage bug (fixes direct i/o writes)Steven Whitehouse
This patch fixes three main bugs. Firstly the direct i/o get_block was returning the wrong return code in certain cases. Secondly, the GFS2's releasepage function was not dealing with cases when clean, ordered buffers were found still queued on a transaction (which can happen depending on the ordering of journal flushes). Thirdly, the journaling code itself needed altering to take account of the after effects of removing the clean ordered buffers from the transactions before a journal flush. The releasepage bug did also show up under "normal" buffered i/o as well, so its not just a fix for direct i/o. In fact its not normally used in the direct i/o path at all, except when flushing existing buffers after performing a direct i/o write, but that was the code path that led us to spot this. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-22[GFS2] Another list_del bugSteven Whitehouse
Another case where list_del should be list_del_init. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-22[GFS2] Fix to list_del in lops.cSteven Whitehouse
A list_del should have been a list_del_init in lops.c which was resulting in incorrect status returns from list_empty(). Signed-off-by: Steven Whitheouse <swhiteho@redhat.com>
2006-08-18[GFS2] Fix leak of gfs2_bufdataSteven Whitehouse
This fixes a memory leak of struct gfs2_bufdata and also some problems in the ordered write handling code. It needs a bit more testing, but I believe that the reference counting of ordered write buffers should now be correct. This is aimed at fixing Red Hat bugzilla: #201028 and #201082 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-19[GFS2] Always include glock in transactionSteven Whitehouse
Include the glock in the transaction, even when not journaling data in order that ordered write data will be correctly flushed when the lock is released. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-19[GFS2] Remove debugging printksSteven Whitehouse
A few of my printks slipped through last time. Also fix a couple of minor bugs. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-14[GFS2] Fix unlinked file handlingSteven Whitehouse
This patch fixes the way we have been dealing with unlinked, but still open files. It removes all limits (other than memory for inodes, as per every other filesystem) on numbers of these which we can support on GFS2. It also means that (like other fs) its the responsibility of the last process to close the file to deallocate the storage, rather than the person who did the unlinking. Note that with GFS2, those two events might take place on different nodes. Also there are a number of other changes: o We use the Linux inode subsystem as it was intended to be used, wrt allocating GFS2 inodes o The Linux inode cache is now the point which we use for local enforcement of only holding one copy of the inode in core at once (previous to this we used the glock layer). o We no longer use the unlinked "special" file. We just ignore it completely. This makes unlinking more efficient. o We now use the 4th block allocation state. The previously unused state is used to track unlinked but still open inodes. o gfs2_inoded is no longer needed o Several fields are now no longer needed (and removed) from the in core struct gfs2_inode o Several fields are no longer needed (and removed) from the in core superblock There are a number of future possible optimisations and clean ups which have been made possible by this patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18[GFS2] Update copyright date to 2006Steven Whitehouse
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18[GFS2] Remove semaphore.h from C filesSteven Whitehouse
We no longer use semaphores, everything has been converted to mutex or rwsem, so we don't need to include this header any more. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-07[GFS2] Fix a ref count bug and other clean upsSteven Whitehouse
This fixes a ref count bug that sometimes showed up a umount time (causing it to hang) but it otherwise mostly harmless. At the same time there are some clean ups including making the log operations structures const, moving a memory allocation so that its not done in the fast path of checking to see if there is an outstanding transaction related to a particular glock. Removes the sd_log_wrap varaible which was updated, but never actually used anywhere. Updates the gfs2 ioctl() to run without the kernel lock (which it never needed anyway). Removes the "invalidate inodes" loop from GFS2's put_super routine. This is done in kill super anyway so we don't need to do it here. The loop was also bogus in that if there are any inodes "stuck" at this point its a bug and we need to know about it rather than hide it by hanging forever. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-30[GFS] Fix bug in endian conversion for metadata headerSteven Whitehouse
In some cases 16 bit functions were being used rather than 32 bit functions. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-01[GFS2] Fix some bugsSteven Whitehouse
Fix a bug I introduced earlier with a kfree() and usage of a structure in the wrong order. Also try and get the counts of the journaled data buffers "more correct". Still some work to do in this area though. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27[GFS2] Macros removal in gfs2.hSteven Whitehouse
As suggested by Pekka Enberg <penberg@cs.helsinki.fi>. The DIV_RU macro is renamed DIV_ROUND_UP and and moved to kernel.h The other macros are gone from gfs2.h as (although not requested by Pekka Enberg) are a number of included header file which are now included individually. The inode number comparison function is now an inline function. The DT2IF and IF2DT may be addressed in a future patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27[GFS2] 80 Column audit of GFS2Steven Whitehouse
Requested by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-22[GFS2] Missed deletion of debugging codeSteven Whitehouse
One line which should have been deleted in the last patch was missed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-22[GFS2] Add list empty test to databuf_lo_addSteven Whitehouse
Heinz had spotted that I'd forgotten to test in databuf_lo_add() that the data buffer in question hadn't already been added to the list. This was causing an infinite loop later on in the "before commit" routine. This means that GFS2 is now ready to be tested by everybody. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-21[GFS2] Use mutices rather than semaphoresSteven Whitehouse
As well as a number of minor bug fixes, this patch changes GFS to use mutices rather than semaphores. This results in better information in case there are any locking problems. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-13[GFS2] Fix for root inode ref count bugSteven Whitehouse
Umount is now working correctly again. The bug was due to not getting an extra ref count when mounting the fs. We should have bumped it by two (once for the internal pointer to the root inode from the super block and once for the inode hanging off the dcache entry for root). Also this patch tidys up the code dealing with looking up and creating inodes. We now pass Linux inodes (with gfs2_inodes attached) rather than the other way around and this reduces code duplication in various places. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-08[GFS2] Make journaled data files identical to normal files on diskSteven Whitehouse
This is a very large patch, with a few still to be resolved issues so you might want to check out the previous head of the tree since this is known to be unstable. Fixes for the various bugs will be forthcoming shortly. This patch removes the special data format which has been used up till now for journaled data files. Directories still retain the old format so that they will remain on disk compatible with earlier releases. As a result you can now do the following with journaled data files: 1) mmap them 2) export them over NFS 3) convert to/from normal files whenever you want to (the zero length restriction is gone) In addition the level at which GFS' locking is done has changed for all files (since they all now use the page cache) such that the locking is done at the page cache level rather than the level of the fs operations. This should mean that things like loopback mounts and other things which touch the page cache directly should now work. Current known issues: 1. There is a lock mode inversion problem related to the resource group hold function which needs to be resolved. 2. Any significant amount of I/O causes an oops with an offset of hex 320 (NULL pointer dereference) which appears to be related to a journaled data buffer appearing on a list where it shouldn't be. 3. Direct I/O writes are disabled for the time being (will reappear later) 4. There is probably a deadlock between the page lock and GFS' locks under certain combinations of mmap and fs operation I/O. 5. Issue relating to ref counting on internally used inodes causes a hang on umount (discovered before this patch, and not fixed by it) 6. One part of the directory metadata is different from GFS1 and will need to be resolved before next release. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18[GFS2] Rename gfs2_meta_pin to gfs2_pinSteven Whitehouse
Since we'll need to pin data if we are going to journal it, then I'm renaming this function to make it less confusing. It might also be worth moving it into lops.c since there are no users outside that file. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-18[GFS2] Remove gfs2_databuf in favour of gfs2_bufdata structureSteven Whitehouse
Removing the gfs2_databuf structure and using gfs2_bufdata instead is a step towards allowing journaling of data without requiring the metadata header on each journaled block. The idea is to merge the code paths for ordered data with that of journaled data, with the log operations in lops.c tacking account of the different types of buffers as they are presented to it. Largely the code path for metadata will be similar too, but obviously through a different set of log operations. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-16[GFS2] The core of GFS2David Teigland
This patch contains all the core files for GFS2. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>