aboutsummaryrefslogtreecommitdiff
path: root/fs/gfs2/inode.h
AgeCommit message (Collapse)Author
2009-03-24GFS2: Merge lock_dlm module into GFS2Steven Whitehouse
This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!) Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM. This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05GFS2: Banish struct gfs2_dinode_hostSteven Whitehouse
The final field in gfs2_dinode_host was the i_flags field. Thats renamed to i_diskflags in order to avoid confusion with the existing inode flags, and moved into the inode proper at a suitable location to avoid creating a "hole". At that point struct gfs2_dinode_host is no longer needed and as promised (quite some time ago!) it can now be removed completely. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05GFS2: Rationalise header filesSteven Whitehouse
Move the contents of some headers which contained very little into more sensible places, and remove the original header files. This should make it easier to find things. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-18GFS2: high time to take some time over atimeSteven Whitehouse
Until now, we've used the same scheme as GFS1 for atime. This has failed since atime is a per vfsmnt flag, not a per fs flag and as such the "noatime" flag was not getting passed down to the filesystems. This patch removes all the "special casing" around atime updates and we simply use the VFS's atime code. The net result is that GFS2 will now support all the same atime related mount options of any other filesystem on a per-vfsmnt basis. We do lose the "lazy atime" updates, but we gain "relatime". We could add lazy atime to the VFS at a later date, if there is a requirement for that variant still - I suspect relatime will be enough. Also we lose about 100 lines of code after this patch has been applied, and I have a suspicion that it will speed things up a bit, even when atime is "on". So it seems like a nice clean up as well. From a user perspective, everything stays the same except the loss of the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very least, and to be honest I don't think anybody ever used it) and that a number of options which were ignored before now work correctly. Please let me know if you've got any comments. I'm pushing this out early so that you can all see what my plans are. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-08-27GFS2: Fix & clean up GFS2 renameSteven Whitehouse
This patch fixes a locking issue in the rename code by ensuring that we hold the per sb rename lock over both directory and "other" renames which involve different parent directories. At the same time, this moved the (only called from one place) function gfs2_ok_to_move into the file that its called from, so we can mark it static. This should make a code a bit easier to follow. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Peter Staubach <staubach@redhat.com>
2008-07-26[PATCH] don't pass nameidata to gfs2_lookupi()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-10[GFS2] Remove unused declarationLi Xiaodong
The implementation of gfs2_inode_attr_in is removed. So remove its declaration. Signed-off-by: Li Xiaodong <lixd@cn.fujitsu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-07-03[GFS2] don't call permission()Miklos Szeredi
GFS2 calls permission() to verify permissions after locks on the files have been taken. For this it's sufficient to call gfs2_permission() instead. This results in the following changes: - IS_RDONLY() check is not performed - IS_IMMUTABLE() check is not performed - devcgroup_inode_permission() is not called - security_inode_permission() is not called IS_RDONLY() should be unnecessary anyway, as the per-mount read-only flag should provide protection against read-only remounts during operations. do_gfs2_set_flags() has been fixed to perform mnt_want_write()/mnt_drop_write() to protect against remounting read-only. IS_IMMUTABLE has been added to gfs2_permission() Repeating the security checks seems to be pointless, as they don't normally change, and if they do, it's independent of the filesystem state. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31[GFS2] Eliminate (almost) duplicate field from gfs2_inodeSteven Whitehouse
The blocks counter is almost a duplicate of the i_blocks field in the VFS inode. The only difference is that i_blocks can be only 32bits long for 32bit arch without large single file support. Since GFS2 doesn't handle the non-large single file case (for 32 bit anyway) this adds a new config dependency on 64BIT || LSF. This has always been the case, however we've never explicitly said so before. Even if we do add support for the non-LSF case, we will still not require this field to be duplicated since we will not be able to access oversized files anyway. So the net result of all this is that we shave 8 bytes from a gfs2_inode and get our config deps correct. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31[GFS2] Streamline indirect pointer tree height calculationSteven Whitehouse
This patch improves the calculation of the tree height in order to reduce the number of operations which are carried out on each call to gfs2_block_map. In the common case, we now make a single comparison, rather than calculating the required tree height from scratch each time. Also in the case that the tree does need some extra height, we start from the current height rather from zero when we work out what the new height ought to be. In addition the di_height field is moved into the inode proper and reduced in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Introduce gfs2_set_aops()Steven Whitehouse
Just like ext3 we now have three sets of address space operations to cover the cases of writeback, ordered and journalled data writes. This means that the individual operations can now become less complicated as we are able to remove some of the tests for file data mode from the code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Add gfs2_is_writeback()Steven Whitehouse
This adds a function "gfs2_is_writeback()" along the lines of the existing "gfs2_is_jdata()" in order to clean up the code and make the various tests for the inode mode more obvious. It also fixes the PageChecked() logic where we were resetting the flag too early in the case of an error path. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10[GFS2] Alternate gfs2_iget to avoid looking up inodes being freedBenjamin Marzinski
There is a possible deadlock between two processes on the same node, where one process is deleting an inode, and another process is looking for allocated but unused inodes to delete in order to create more space. process A does an iput() on inode X, and it's i_count drops to 0. This causes iput_final() to be called, which puts an inode into state I_FREEING at generic_delete_inode(). There no point between when iput_final() is called, and when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is set, no other process on that node can successfully look up that inode until the delete finishes. process B locks the the resource group for the same inode in get_local_rgrp(), which is called by gfs2_inplace_reserve_i() process A tries to lock the resource group for the inode in gfs2_dinode_dealloc(), but it's already locked by process B process B waits in find_inode for the inode to have the I_FREEING state cleared. Deadlock. This patch solves the problem by adding an alternative to gfs2_iget(), gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING state.o The alternate test function is just like the original one, except that it fails if the inode is being freed, and sets a skipped flag. The alternate set function is just like the original, except that it fails if the skipped flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of gfs2_iget(). Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Remove i_mode passing from NFS File HandleWendy Cheng
GFS2 has been passing i_mode within NFS File Handle. Other than the wrong assumption that there is always room for this extra 16 bit value, the current gfs2_get_dentry doesn't really need the i_mode to work correctly. Note that GFS2 NFS code does go thru the same lookup code path as direct file access route (where the mode is obtained from name lookup) but gfs2_get_dentry() is coded for different purpose. It is not used during lookup time. It is part of the file access procedure call. When the call is invoked, if on-disk inode is not in-memory, it has to be read-in. This makes i_mode passing a useless overhead. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Obtaining no_formal_ino from directory entryWendy Cheng
GFS2 lookup code doesn't ask for inode shared glock. This implies during in-memory inode creation for existing file, GFS2 will not disk-read in the inode contents. This leaves no_formal_ino un-initialized during lookup time. The un-initialized no_formal_ino is subsequently encoded into file handle. Clients will get ESTALE error whenever it tries to access these files. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Fix sign problem in quota/statfs and cleanup _host structuresSteven Whitehouse
This patch fixes some sign issues which were accidentally introduced into the quota & statfs code during the endianess annotation process. Also included is a general clean up which moves all of the _host structures out of gfs2_ondisk.h (where they should not have been to start with) and into the places where they are actually used (often only one place). Also those _host structures which are not required any more are removed entirely (which is the eventual plan for all of them). The conversion routines from ondisk.c are also moved into the places where they are actually used, which for almost every one, was just one single place, so all those are now static functions. This also cleans up the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__. The net result is a reduction of about 100 lines of code, many functions now marked static plus the bug fixes as mentioned above. For good measure I ran the code through sparse after making these changes to check that there are no warnings generated. This fixes Red Hat bz #239686 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Clean up inode number handlingSteven Whitehouse
This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Remove unused function from inode.cSteven Whitehouse
The gfs2_glock_nq_m_atime function is unused in so far as its only ever called with num_gh = 1, and this falls through to the gfs2_glock_nq_atime function, so we might as well call that directly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Remove gfs2_inode_attr_inSteven Whitehouse
This function wasn't really doing the right thing. There was no need to update the inode size at this point and the updating of the i_blocks field has now been moved to the places where di_blocks is updated. A result of this patch and some those preceeding it is that unlocking a glock is now a much more efficient process, since there is no longer any requirement to copy data from the gfs2 inode into the vfs inode at this point. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (6) - di_atime/di_mtime/di_ctimeSteven Whitehouse
Remove the di_[amc]time fields and use inode->i_[amc]time fields instead. This saves 24 bytes from the gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (3) - di_modeSteven Whitehouse
This removes the duplicate di_mode field in favour of using the inode->i_mode field. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (2) - di_major/di_minorSteven Whitehouse
This removes the device numbers from this structure by using inode->i_rdev instead. It also cleans up the code in gfs2_mknod. It results in shrinking the gfs2_inode by 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] split and annotate gfs2_inumAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-25[GFS2/DLM] Fix trailing whitespaceSteven Whitehouse
As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-01[GFS2] Update copyright, tidy up incore.hSteven Whitehouse
As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this updates the copyright message to say "version" in full rather than "v.2". Also incore.h has been updated to remove forward structure declarations which are not required. The gfs2_quota_lvb structure has now had endianess annotations added to it. Also quota.c has been updated so that we now store the lvb data locally in endian independant format to avoid needing a structure in host endianess too. As a result the endianess conversions are done as required at various points and thus the conversion routines in lvb.[ch] are no longer required. I've moved the one remaining constant in lvb.h thats used into lm.h and removed the unused lvb.[ch]. I have not changed the HIF_ constants. That is left to a later patch which I hope will unify the gh_flags and gh_iflags fields of the struct gfs2_holder. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-22[GFS2] Remove gfs2_repermissionSteven Whitehouse
gfs2_repermission is just a wrapper for permission, so remove it and call permission directly where required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-14[GFS2] Fix unlinked file handlingSteven Whitehouse
This patch fixes the way we have been dealing with unlinked, but still open files. It removes all limits (other than memory for inodes, as per every other filesystem) on numbers of these which we can support on GFS2. It also means that (like other fs) its the responsibility of the last process to close the file to deallocate the storage, rather than the person who did the unlinking. Note that with GFS2, those two events might take place on different nodes. Also there are a number of other changes: o We use the Linux inode subsystem as it was intended to be used, wrt allocating GFS2 inodes o The Linux inode cache is now the point which we use for local enforcement of only holding one copy of the inode in core at once (previous to this we used the glock layer). o We no longer use the unlinked "special" file. We just ignore it completely. This makes unlinking more efficient. o We now use the 4th block allocation state. The previously unused state is used to track unlinked but still open inodes. o gfs2_inoded is no longer needed o Several fields are now no longer needed (and removed) from the in core struct gfs2_inode o Several fields are no longer needed (and removed) from the in core superblock There are a number of future possible optimisations and clean ups which have been made possible by this patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18[GFS2] Update copyright date to 2006Steven Whitehouse
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28[GFS2] Reordering in deallocation to avoid recursive lockingSteven Whitehouse
Despite my earlier careful search, there was a recursive lock left in the deallocation code. This removes it. It also should speed up deallocation be reducing the number of locking operations which take place by using two "try lock" operations on the two locks involved in inode deallocation which allows us to grab the locks out of order (compared with NFS which grabs the inode lock first and the iopen lock later). It is ok for us to fail while doing this since if it does fail it means that someone else is still using the inode and thus it wouldn't be possible to deallocate anyway. This fixes the bug reported to me by Rob Kenna. Cc: Rob Kenna <rkenna@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-20[GFS2] Fix bug in directory code and tidy upSteven Whitehouse
Due to a typo, the dir leaf split operation was (for the first split in a directory) writing the new hash vaules at the wrong offset. This is now fixed. Also some other tidy ups are included: - We use GFS2's hash function for dentries (see ops_dentry.c) so that we don't have to keep recalculating the hash values. - A lot of common code is eliminated between the various directory lookup routines. - Better error checking on directory lookup (previously different routines checked for different errors) - The leaf split operation has a couple of redundant operations removed from it, so it should be faster. There is still further scope for further clean ups in the directory code, and readdir in particular could do with slimming down a bit. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-01[GFS2] Tidy up mount code.Steven Whitehouse
We no longer lookup ".gfs2_admin" in the root directory in order to find it, but instead use the inode number given in the superblock. Both the root directory and the admin directory are now looked up using the same routine, so the redundant code is removed. Also, there is no longer a reference to the root inode in the GFS2 super block. When required this can be retreived via sb->s_root->d_inode instead. Assuming that we introduce a metadata filesystem type for GFS, then this is a first step towards that goal. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27[GFS2] 80 Column audit of GFS2Steven Whitehouse
Requested by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-13[GFS2] Fix for root inode ref count bugSteven Whitehouse
Umount is now working correctly again. The bug was due to not getting an extra ref count when mounting the fs. We should have bumped it by two (once for the internal pointer to the root inode from the super block and once for the inode hanging off the dcache entry for root). Also this patch tidys up the code dealing with looking up and creating inodes. We now pass Linux inodes (with gfs2_inodes attached) rather than the other way around and this reduces code duplication in various places. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-08[GFS2] Make journaled data files identical to normal files on diskSteven Whitehouse
This is a very large patch, with a few still to be resolved issues so you might want to check out the previous head of the tree since this is known to be unstable. Fixes for the various bugs will be forthcoming shortly. This patch removes the special data format which has been used up till now for journaled data files. Directories still retain the old format so that they will remain on disk compatible with earlier releases. As a result you can now do the following with journaled data files: 1) mmap them 2) export them over NFS 3) convert to/from normal files whenever you want to (the zero length restriction is gone) In addition the level at which GFS' locking is done has changed for all files (since they all now use the page cache) such that the locking is done at the page cache level rather than the level of the fs operations. This should mean that things like loopback mounts and other things which touch the page cache directly should now work. Current known issues: 1. There is a lock mode inversion problem related to the resource group hold function which needs to be resolved. 2. Any significant amount of I/O causes an oops with an offset of hex 320 (NULL pointer dereference) which appears to be related to a journaled data buffer appearing on a list where it shouldn't be. 3. Direct I/O writes are disabled for the time being (will reappear later) 4. There is probably a deadlock between the page lock and GFS' locks under certain combinations of mmap and fs operation I/O. 5. Issue relating to ref counting on internally used inodes causes a hang on umount (discovered before this patch, and not fixed by it) 6. One part of the directory metadata is different from GFS1 and will need to be resolved before next release. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30[GFS2] Add gfs2_internal_read()Steven Whitehouse
Add the new external read function. Its temporarily in jdata.c even though the protoype is in ops_file.h - this will change shortly. The current implementation will change to a page cache one when that happens. In order to effect the above changes, the various internal inodes now have Linux inodes attached to them. We keep the references to the Linux inodes, rather than the gfs2_inodes in the super block. In order to get everything to work correctly I've had to reorder the init sequence on mount (which I should probably have done earlier when .gfs2_admin was made visible). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-16[GFS2] The core of GFS2David Teigland
This patch contains all the core files for GFS2. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>