aboutsummaryrefslogtreecommitdiff
path: root/fs/nfs
AgeCommit message (Collapse)Author
2007-07-10NFSv4: Make sure unlock is really an unlock when cancelling a lockFrank Filz
I ran into a curious issue when a lock is being canceled. The cancellation results in a lock request to the vfs layer instead of an unlock request. This is particularly insidious when the process that owns the lock is exiting. In that case, sometimes the erroneous lock is applied AFTER the process has entered zombie state, preventing the lock from ever being released. Eventually other processes block on the lock causing a slow degredation of the system. In the 2.6.16 kernel this was investigated on, the problem is compounded by the fact that the cl_sem is held while blocking on the vfs lock, which results in most processes accessing the nfs file system in question hanging. In more detail, here is how the situation occurs: first _nfs4_do_setlk(): static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int reclaim) ... ret = nfs4_wait_for_completion_rpc_task(task); if (ret == 0) { ... } else data->cancelled = 1; then nfs4_lock_release(): static void nfs4_lock_release(void *calldata) ... if (data->cancelled != 0) { struct rpc_task *task; task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp, data->arg.lock_seqid); The problem is the same file_lock that was passed in to _nfs4_do_setlk() gets passed to nfs4_do_unlck() from nfs4_lock_release(). So the type is still F_RDLCK or FWRLCK, not F_UNLCK. At some point, when cancelling the lock, the type needs to be changed to F_UNLCK. It seemed easiest to do that in nfs4_do_unlck(), but it could be done in nfs4_lock_release(). The concern I had with doing it there was if something still needed the original file_lock, though it turns out the original file_lock still needs to be modified by nfs4_do_unlck() because nfs4_do_unlck() uses the original file_lock to pass to the vfs layer, and a copy of the original file_lock for the RPC request. It seems like the simplest solution is to force all situations where nfs4_do_unlck() is being used to result in an unlock, so with that in mind, I made the following change: Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Make the NFS state model work with the nosharedcache mount optionTrond Myklebust
Consider the case where the user has mounted the remote filesystem server:/foo on the two local directories /bar and /baz using the nosharedcache mount option. The files /bar/file and /baz/file are represented by different inodes in the local namespace, but refer to the same file /foo/file on the server. Consider the case where a process opens both /bar/file and /baz/file, then closes /bar/file: because the nfs4_state is not shared between /bar/file and /baz/file, the kernel will see that the nfs4_state for /bar/file is no longer referenced, so it will send off a CLOSE rpc call. Unless the open_owners differ, then that CLOSE call will invalidate the open state on /baz/file too. Conclusion: we cannot share open state owners between two different non-shared mount instances of the same filesystem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Error when mounting the same filesystem with different optionsTrond Myklebust
Unless the user sets the NFS_MOUNT_NOSHAREDCACHE mount flag, we should return EBUSY if the filesystem is already mounted on a superblock that has set conflicting mount options. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Add the mount option "nosharecache"Trond Myklebust
Prior to David Howell's mount changes in 2.6.18, users who mounted different directories which happened to be from the same filesystem on the server would get different super blocks, and hence could choose different mount options. As long as there were no hard linked files that crossed from one subtree to another, this was quite safe. Post the changes, if the two directories are on the same filesystem (have the same 'fsid'), they will share the same super block, and hence the same mount options. Add a flag to allow users to elect not to share the NFS super block with another mount point, even if the fsids are the same. This will allow users to set different mount options for the two different super blocks, as was previously possible. It is still up to the user to ensure that there are no cache coherency issues when doing this, however the default behaviour will be to share super blocks whenever two paths result in the same fsid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Add support for mounting NFSv4 file systems with string optionsChuck Lever
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Add final pieces to support in-kernel mount option parsingChuck Lever
Hook in final components required for supporting in-kernel mount option parsing for NFSv2 and NFSv3 mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Introduce generic mount client APIChuck Lever
For NFSv2 and v3 mounts, the first step is to contact the server's MOUNTD and request the file handle for the root of the mounted share. Add a function to the NFS client that handles this operation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Add enums and match tables for mount option parsingChuck Lever
This generic infrastructure works for both NFS and NFSv4 mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Improve debugging output in NFS in-kernel mount clientChuck Lever
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean up in-kernel NFS mountChuck Lever
Clean up white space and coding conventions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Remake nfsroot_mount as a permanent part of NFS clientChuck Lever
In preparation for supporting NFSv2 and NFSv3 mount option handling in the kernel NFS client, convert mount_clnt.c to be a permanent part of the NFS client, instead of built only when CONFIG_ROOT_NFS is enabled. In addition, we also replace the "struct sockaddr_in *" argument with something more generic, to help support IPv6 at some later point. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10SUNRPC: Add a convenient default for the hostname when calling rpc_create()Chuck Lever
A couple of callers just use a stringified IP address for the rpc client's hostname. Move the logic for constructing this into rpc_create(), so it can be shared. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10SUNRPC: Rename rpcb_getport_external routineChuck Lever
In preparation for handling NFS mount option parsing in the kernel, rename rpcb_getport_external as rpcb_get_port_sync, and make it available always (instead of only when CONFIG_ROOT_NFS is enabled). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Introduce nfs4_validate_mount_optionsChuck Lever
Refactor NFSv4 mount processing to break out mount data validation in the same way it's broken out in the NFSv2/v3 mount path. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean up nfs_validate_mount_dataChuck Lever
Move error handling code out of the main code path. The switch statement was also improperly indented, according to Documentation/CodingStyle. This prepares nfs_validate_mount_data for the addition of option string parsing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean-up: Refactor IP address sanity checks in NFS clientChuck Lever
NFS and NFSv4 mounts can now share server address sanity checking. And, it provides an easy mechanism for adding IPv6 address checking at some later point. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean-up: fix a compiler warning in fs/nfs/super.cChuck Lever
/home/cel/linux/fs/nfs/super.c: In function 'nfs_pseudoflavour_to_name': /home/cel/linux/fs/nfs/super.c:270: warning: comparison between signed and unsigned Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean up error handling in nfs_get_sbChuck Lever
The error return logic in nfs_get_sb now matches nfs4_get_sb, and is more maintainable. A subsequent patch will take advantage of this simplification. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean-up: Replace nfs_copy_user_string with strndup_userChuck Lever
The new string utility function strndup_user can be used instead of nfs_copy_user_string, eliminating an unnecessary duplication of function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean-up: Define macros for maximum host and export path name lengthsChuck Lever
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Clean-up: use correct type when converting NFS blocks to local blocksChuck Lever
inode->i_blocks is a blkcnt_t these days, which can be a u64 or unsigned long, depending on the setting of CONFIG_LSF. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix up stateid locking...Trond Myklebust
We really don't need to grab both the state->so_owner and the inode->i_lock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Clean up the callers of nfs4_open_recover_helper()Trond Myklebust
Rely on nfs4_try_open_cached() when appropriate. Also fix an RCU violation in _nfs4_do_open_reclaim() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Don't call OPEN if we already have an open stateid for a fileTrond Myklebust
If we already have a stateid with the correct open mode for a given file, then we can reuse that stateid instead of re-issuing an OPEN call without violating the close-to-open caching semantics. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Check for the existence of a delegation in nfs4_open_prepare()Trond Myklebust
We should not be calling open() on an inode that has a delegation unless we're doing a reclaim. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Clean up _nfs4_proc_open()Trond Myklebust
Use a flag instead of the 'data->rpc_status = -ENOMEM hack. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Allow nfs4_opendata_to_nfs4_state to return errors.Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Improve the debugging of bad sequence id errors...Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Always use the delegation if we have oneTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Clean up confirmation of sequence ids...Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Defer inode revalidation when setting up a delegationTrond Myklebust
Currently we force a synchronous call to __nfs_revalidate_inode() in nfs_inode_set_delegation(). This not only ensures that we cannot call nfs_inode_set_delegation from an asynchronous context, but it also slows down any call to open(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Use RCU to protect delegationsTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Support recalling delegations by stateid part 2Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Support recalling delegations by stateidTrond Myklebust
There appear to be some rogue servers out there that issue multiple delegations with different stateids for the same file. Ensure that when we return delegations, we do so on a per-stateid basis rather than a per-file basis. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix up a bug in nfs4_open_recover()Trond Myklebust
Don't clobber the delegation info... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: set the delegation in nfs4_opendata_to_nfs4_stateTrond Myklebust
This ensures that nfs4_open_release() and nfs4_open_confirm_release() can now handle an eventual delegation that was returned with out open. As such, it fixes a delegation "leak" when the user breaks out of an open call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix a bug in __nfs4_find_state_byownerTrond Myklebust
The test for state->state == 0 does not tell you that the stateid is in the process of being freed. It really tells you that the stateid is not yet initialised... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix atomic open for execute...Trond Myklebust
Currently we do not check for the FMODE_EXEC flag as we should. For that particular case, we need to perform an ACCESS call to the server in order to check that the file is executable. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Reduce the chances of an open_owner identifier collisionTrond Myklebust
Currently we just use a 32-bit counter. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: nfs_increment_open_seqid should not return a valueTrond Myklebust
It is a void function... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix underestimate of NFSv4 lookup request sizeTrond Myklebust
Also fix up the underestimate of fs_locations Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix the underestimate of NFSv4 open request sizeTrond Myklebust
The maximum size depends on the filename size and a number of other elements which are currently not being counted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix the NFSv4 owner and owner_group size estimatesTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Don't reuse expired nfs4_state_owner structsTrond Myklebust
That just confuses certain NFSv4 servers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Fix a credential reference leak in nfs4_get_state_owner()Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFS: Replace NFS_I(inode)->req_lock with inode->i_lockTrond Myklebust
There is no justification for keeping a special spinlock for the exclusive use of the NFS writeback code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10NFSv4: Clean up _nfs4_proc_lookup() vs _nfs4_proc_lookupfh()Trond Myklebust
They differ only slightly in the arguments they take. Why have they not been merged? Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10SUNRPC: Remove the tk_auth macro...Trond Myklebust
We should almost always be deferencing the rpc_auth struct by means of the credential's cr_auth field instead of the rpc_clnt->cl_auth anyway. Fix up that historical mistake, and remove the macro that propagated it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10SUNRPC: Remove redundant calls to rpciod_up()/rpciod_down()Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10SUNRPC: Kill rpc_clnt->cl_oneshotTrond Myklebust
Replace it with explicit calls to rpc_shutdown_client() or rpc_destroy_client() (for the case of asynchronous calls). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>