aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2005-10-18NFSv4: Add missing handling of OPEN_CONFIRM requests on CLAIM_DELEGATE_CUR.Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18NFSv4: Remove nfs4_client->cl_sem from close() pathTrond Myklebust
We no longer need to worry about collisions between close() and the state recovery code, since the new close will automatically recheck the file state once it is done waiting on its sequence slot. Ditto for the nfs4_proc_locku() procedure. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18NFSv4: Remove obsolete state_owner and lock_owner semaphoresTrond Myklebust
OPEN, CLOSE, etc no longer need these semaphores to ensure ordering of requests. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18NFSv4: Fix a potential CLOSE raceTrond Myklebust
Once the state_owner and lock_owner semaphores get removed, it will be possible for other OPEN requests to reopen the same file if they have lower sequence ids than our CLOSE call. This patch ensures that we recheck the file state once nfs_wait_on_sequence() has completed waiting. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18NFSv4: Add functions to order RPC callsTrond Myklebust
NFSv4 file state-changing functions such as OPEN, CLOSE, LOCK,... are all labelled with "sequence identifiers" in order to prevent the server from reordering RPC requests, as this could cause its file state to become out of sync with the client. Currently the NFS client code enforces this ordering locally using semaphores to restrict access to structures until the RPC call is done. This, of course, only works with synchronous RPC calls, since the user process must first grab the semaphore. By dropping semaphores, and instead teaching the RPC engine to hold the RPC calls until they are ready to be sent, we can extend this process to work nicely with asynchronous RPC calls too. This patch adds a new list called "rpc_sequence" that defines the order of the RPC calls to be sent. We add one such list for each state_owner. When an RPC call is ready to be sent, it checks if it is top of the rpc_sequence list. If so, it proceeds. If not, it goes back to sleep, and loops until it hits top of the list. Once the RPC call has completed, it can then bump the sequence id counter, and remove itself from the rpc_sequence list, and then wake up the next sleeper. Note that the state_owner sequence ids and lock_owner sequence ids are all indexed to the same rpc_sequence list, so OPEN, LOCK,... requests are all ordered w.r.t. each other. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6Trond Myklebust
2005-10-17[PATCH] aio: revert lock_kiocb()Zach Brown
lock_kiocb() was introduced to serialize retrying and cancellation. In the process of doing so it tried to sleep waiting for KIF_LOCKED while holding the ctx_lock spinlock. Recent fixes have ensured that multiple concurrent retries won't be attempted for a given iocb. Cancel has other problems and has no significant in-tree users that have been complaining about it. So for the immediate future we'll revert sleeping with the lock held and will address proper cancellation and retry serialization in the future. Signed-off-by: Zach Brown <zach.brown@oracle.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17[PATCH] output of /proc/maps on nommu systems is incompleteDavid McCullough
Currently you do not get all the map entries on nommu systems because the start function doesn't index into the list using the value of "pos". Signed-off-by: David McCullough <davidm@snapgear.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17[PATCH] NFS: Fix Oopsable/unnecessary i_count manipulations in ↵Trond Myklebust
nfs_wait_on_inode() Oopsable since nfs_wait_on_inode() can get called as part of iput_final(). Unnecessary since the caller had better be damned sure that the inode won't disappear from underneath it anyway. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17[PATCH] NFS: Fix cache consistency racesTrond Myklebust
If the data cache has been marked as potentially invalid by nfs_refresh_inode, we should invalidate it rather than assume that changes are due to our own activity. Also ensure that we always start with a valid cache before declaring it to be protected by a delegation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-14[PATCH] nommu build error fixYoshinori Sato
"proc_smaps_operations" is not defined in case of "CONFIG_MMU=n". Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-11[PATCH] binfmt_elf bss padding fixakpm@osdl.org
Nir Tzachar <tzachar@cs.bgu.ac.il> points out that if an ELF file specifies a zero-length bss at a whacky address, we cannot load that binary because padzero() tries to zero out the end of the page at the whacky address, and that may not be writeable. See also http://bugzilla.kernel.org/show_bug.cgi?id=5411 So teach load_elf_binary() to skip the bss settng altogether if the elf file has a zero-length bss segment. Cc: Roland McGrath <roland@redhat.com> Cc: Daniel Jacobowitz <dan@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-11[PATCH] nfsacl: Solaris VxFS compatibility fixAndreas Gruenbacher
Here is a compatibility fix between Linux and Solaris when used with VxFS filesystems: Solaris usually accepts acl entries in any order, but with VxFS it replies with NFSERR_INVAL when it sees a four-entry acl that is not in canonical form. It may also fail with other non-canonical acls -- I can't tell, because that case never triggers: We only send non-canonical acls when we fake up an ACL_MASK entry. Instead of adding fake ACL_MASK entries at the end, inserting them in the correct position makes Solaris+VxFS happy. The Linux client and server sides don't care about entry order. The three-entry-acl special case in which we need a fake ACL_MASK entry was handled in xdr_nfsace_encode. The patch moves this into nfsacl_encode. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-11[PATCH] v9fs: remove additional buffer allocation from v9fs_file_read and ↵Latchesar Ionkov
v9fs_file_write v9fs_file_read and v9fs_file_write use kmalloc to allocate buffers as big as the data buffer received as parameter. kmalloc cannot be used to allocate buffers bigger than 128K, so reading/writing data in chunks bigger than 128k fails. This patch reorganizes v9fs_file_read and v9fs_file_write to allocate only buffers as big as the maximum data that can be sent in one 9P message. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-11JFS: Corrupted block map should not cause trapDave Kleikamp
Replace assert statements with better error handling. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-10-10[PATCH] relayfs: fix bogus param value in call to vmapTom Zanussi
The third param in this call to vmap shouldn't be GFP_KERNEL, which makes no sense, but rather VM_MAP. Thanks to Al Viro for spotting this. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08[PATCH] gfp flags annotations - part 1Al Viro
- added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-06Avoid 'names_cache' memory leak with CONFIG_AUDITSYSCALLLinus Torvalds
The nameidata "last.name" is always allocated with "__getname()", and should always be free'd with "__putname()". Using "putname()" without the underscores will leak memory, because the allocation will have been hidden from the AUDITSYSCALL code. Arguably the real bug is that the AUDITSYSCALL code is really broken, but in the meantime this fixes the problem people see. Reported by Robert Derr, patch by Rick Lindsley. Acked-by: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-04[PATCH] bfs iget() abusesAl Viro
bfs_fill_super() walks the inode table to get the bitmap of free inodes and collect stats. It has no business using iget() for that - it's a lot of extra work, extra icache pollution and more complex code. Switched to walking the damn thing directly. Note: that also allows to kill ->i_dsk_ino in there - separate patch if Tigran can confirm that this field can be zero only for deleted inodes (i.e. something that could only be found during that scan and not by normal lookups). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-04[PATCH] bfs endianness annotationsAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-04NTFS: Fix a 64-bitness bug where a left-shift could overflow a 32-bit variableAnton Altaparmakov
which we now cast to 64-bit first (fs/ntfs/mft.c::map_mft_record_page(). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-04NTFS: Fix a stupid bug in __ntfs_bitmap_set_bits_in_run() which caused theAnton Altaparmakov
count to become negative and hence we had a wild memset() scribbling all over the system's ram. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-10-03JFS: make special inodes play nicely with page balancingDave Kleikamp
This patch fixes up a few problems with jfs's reserved inodes. 1. There is no need for the jfs code setting the I_DIRTY bits in i_state. I am ashamed that the code ever did this, and surprised it hasn't been noticed until now. 2. Make sure special inodes are on an inode hash list. If the inodes are unhashed, __mark_inode_dirty will fail to put the inode on the superblock's dirty list, and the data will not be flushed under memory pressure. 3. Force writing journal data to disk when metapage_writepage is unable to write a metadata page due to pending journal I/O. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-09-30[PATCH] fuse: check O_DIRECTMiklos Szeredi
Check O_DIRECT and return -EINVAL error in open. dentry_open() also checks this but only after the open method is called. This patch optimizes away the unnecessary upcalls in this case. It could be a correctness issue too: if filesystem has open() with side effect, then it should fail before doing the open, not after. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] uml: remove empty hostfs_truncate methodPaolo 'Blaisorblade' Giarrusso
Calling truncate() on hostfs spits a kernel warning "Something isn't implemented here", but it still works fine. Indeed, hostfs i_op->truncate doesn't do anything. But hostfs_setattr() -> set_attr() correctly detects ATTR_SIZE and calls truncate() on the host. So we should be safe (using ftruncate() may be better, in case the file is unlinked on the host, but we aren't sure to have the file open for writing, and reopening it would cause the same races; plus nobody should expect UML to be so careful). So, the warning is wrong, because the current implementation is working. Al, am I correct, and can the warning be therefore dropped? CC: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] aio: avoid extra aio_{read,write} call when ki_left == 0Zach Brown
Recently aio_p{read,write} changed to perform retries internally rather than returning -EIOCBRETRY. This inadvertantly resulted in always calling aio_{read,write} with ki_left at 0 which would in turn immediately return 0. Harmless, but we can avoid this call by checking in the caller. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] aio: remove unlocked task_list test and resulting raceZach Brown
Only one of the run or kick path is supposed to put an iocb on the run list. If both of them do it than one of them can end up referencing a freed iocb. The kick path could delete the task_list item from the wait queue before getting the ctx_lock and putting the iocb on the run list. The run path was testing the task_list item outside the lock so that it could catch ki_retry methods that return -EIOCBRETRY *without* putting the iocb on a wait queue and promising to call kick_iocb. This unlocked check could then race with the kick path to cause both to try and put the iocb on the run list. The patch stops the run path from testing task_list by requring that any ki_retry that returns -EIOCBRETRY *must* guarantee that kick_iocb() will be called in the future. aio_p{read,write}, the only in-tree -EIOCBRETRY users, are updated. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] aio: lock around kiocbTryKick()Zach Brown
Only one of the run or kick path is supposed to put an iocb on the run list. If both of them do it than one of them can end up referencing a freed iocb. The kick patch could set the Kicked bit before acquiring the ctx_lock and putting the iocb on the run list. The run path, while holding the ctx_lock, could see this partial kick and mistake it for a kick that was deferred while it was doing work with the run_list NULLed out. It would then race with the kick thread to add the iocb to the run list. This patch moves the kick setting under the ctx_lock so that only one of the kick or run path queues the iocb on the run list, as intended. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30[PATCH] missing ERR_PTR in 9fsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29[PATCH] readv/writev syscalls are not checked by lsmKostik Belousov
it seems that readv(2)/writev(2) syscalls do not call file_permission callback. Looks like this is overlook. I have filled the issue into redhat bugzilla as https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169433 and got the recommendation to post this on lsm mailing list. The following trivial patch solves the problem. Signed-off-by: Kostik Belousov <kostikbel@gmail.com> Signed-off-by: Chris Wright <chrisw@osdl.org>
2005-09-28[PATCH] epoll: handle timeout overflowDavide Libenzi
Handle the timeout upper boundary for epoll. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] v9fs: fix races in fid allocationLatchesar Ionkov
Fid management cleanup. The patch attempts to fix the races in dentry's fid management. Dentries don't keep the opened fids anymore, they are moved to the file structs. Ideally there should be no more than one fid with fidcreate equal to zero in the dentry's list of fids. v9fs_fid_create initializes the important fields (fid, fidcreated) before v9fs_fid is added to the list. v9fs_fid_lookup returns only fids that are not created by v9fs_create. v9fs_fid_get_created returns the fid created by the same process by v9fs_create (if any) and removes it from dentry's list Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] Fix ext3_new_inode() failure pathsChris Sykes
Fix failure paths in ext3_new_inode() and clean up duplicated code: - DQUOT_DROP() was not being called if ext3_init_security() failed. Signed-off-by: Chris Sykes <chris@sigsegv.plus.com> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] Fix ext2_new_inode() failure pathsChris Sykes
Fix failure paths in ext2_new_inode() and clean up duplicated code: - DQUOT_DROP() was not being called if ext2_init_security() failed. Signed-off-by: Chris Sykes <chris@sigsegv.plus.com> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] fuse: check reserved node ID valuesMiklos Szeredi
This patch checks reserved node ID values returned by lookup and creation operations. In case one of the reserved values is sent, return -EIO. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-28[PATCH] fuse: add required version infoMiklos Szeredi
Add information about required version of the userspace library/utilities to Documentation/Changes. Also add pointer to this and to FUSE documentation from Kconfig. Thanks to Anton Altaparmakov for the reminder. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-26NTFS: Re-fix sparse warnings in a more correct way, i.e. don't use an enum withAnton Altaparmakov
different types in it but #define the two constants instead. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-26Merge branch 'master' of /home/src/linux-2.6/Anton Altaparmakov
2005-09-26NTFS: More $LogFile handling fixes: when chkdsk has been run, it can leave theAnton Altaparmakov
restart pages in the journal without multi sector transfer protection fixups (i.e. the update sequence array is empty and in fact does not exist). Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-26NTFS: Fix the definition of the CHKD ntfs record magic. It had an off byAnton Altaparmakov
two error causing it to be CHKB instead of CHKD. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-23[PATCH] cifs: Add support for suspendSteve French
cifsd had been preventing software suspend from completing. Signed-off-by: pavel@suse.de Signed-off-by: Steve French <sfrench@us.ibm.com> lightly modified Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-23Revert "[PATCH] RPC,NFS: new rpc_pipefs patch"Trond Myklebust
This reverts 17f4e6febca160a9f9dd4bdece9784577a2f4524 commit.
2005-09-23NFS: Make /proc/mounts display the protocol used by NFSv4Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23[PATCH] RPC,NFS: new rpc_pipefs patchChristoph Hellwig
Currently rpc_mkdir/rpc_rmdir and rpc_mkpipe/mk_unlink have an API that's a little unfortunate. They take a path relative to the rpc_pipefs root and thus need to perform a full lookup. If you look at debugfs or usbfs they always store the dentry for directories they created and thus can pass in a dentry + single pathname component pair into their equivalents of the above functions. And in fact rpc_pipefs actually stores a dentry for all but one component so this change not only simplifies the core rpc_pipe code but also the callers. Unfortuntately this code path is only used by the NFS4 idmapper and AUTH_GSSAPI for which I don't have a test enviroment. Could someone give it a spin? It's the last bit needed before we can rework the lookup_hash API Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23[PATCH] RPC: parametrize various transport connect timeoutsChuck Lever
Each transport implementation can now set unique bind, connect, reestablishment, and idle timeout values. These are variables, allowing the values to be modified dynamically. This permits exponential backoff of any of these values, for instance. As an example, we implement exponential backoff for the connection reestablishment timeout. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23[PATCH] RPC: remove xprt->nocongChuck Lever
Get rid of the "xprt->nocong" variable. Test-plan: Use WAN simulation to cause sporadic bursty packet loss with UDP mounts. Look for significant regression in performance or client stability. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23[PATCH] RPC: get rid of xprt->streamChuck Lever
Now we can fix up the last few places that use the "xprt->stream" variable, and get rid of it from the rpc_xprt structure. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23[PATCH] NFS: use a constant value for TCP retransmit timeoutsChuck Lever
Implement a best practice: don't use exponential backoff when computing retransmit timeout values on TCP connections, but simply retransmit at regular intervals. This also fixes a bug introduced when xprt_reset_majortimeo() was added. Test-plan: Enable RPC debugging and watch timeout behavior on a NFS/TCP mount. Version: Thu, 11 Aug 2005 16:02:19 -0400 Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23NFS: Don't expose internal READDIR errors to userspaceTrond Myklebust
Fixes a condition whereby the kernel is returning the non-POSIX error EBADCOOKIE to userspace. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-09-23From: Olaf Kirch <okir@suse.de>Olaf Kirch
[PATCH] Fix miscompare in __posix_lock_file If an application requests the same lock twice, the kernel should just leave the existing lock in place. Currently, it will install a second lock of the same type. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>