aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2006-04-11[PATCH] splice: speedups and optimizationsJens Axboe
- Kill the local variables that cache ->nrbufs, they just take up space. - Only set do_wakeup for a real pipe. This is a big win for direct splicing. - Kill i_mutex lock around ->f_pos update, regular io paths don't do this either. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11[PATCH] pipe.c/fifo.c code cleanupsIngo Molnar
more code cleanups after the macro conversion: - standardize on 'struct pipe_inode_info *pipe' variable names - introduce 'pipe' temporaries to reduce mass inode->i_pipe dereferencing Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11[PATCH] get rid of the PIPE_*() macrosIngo Molnar
get rid of the PIPE_*() macros. Scripted transformation. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11[PATCH] splice: speedup __generic_file_splice_readJens Axboe
Using find_get_page() is a lot faster than find_or_create_page(). This gets splice a lot closer to sendfile() for fd -> socket transfers. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11[PATCH] splice: add direct fd <-> fd splicing supportJens Axboe
It's more efficient for sendfile() emulation. Basically we cache an internal private pipe and just use that as the intermediate area for pages. Direct splicing is not available from sys_splice(), it is only meant to be used for sendfile() emulation. Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at exit for the normal fast path. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] splice: add optional input and output offsetsIngo Molnar
add optional input and output offsets to sys_splice(), for seekable file descriptors: asmlinkage long sys_splice(int fd_in, loff_t __user *off_in, int fd_out, loff_t __user *off_out, size_t len, unsigned int flags); semantics are straightforward: f_pos will be updated with the offset provided by user-space, before the splice transfer is about to begin. Providing a NULL offset pointer means the existing f_pos will be used (and updated in situ). Providing an offset for a pipe results in -ESPIPE. Providing an invalid offset pointer results in -EFAULT. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] introduce a "kernel-internal pipe object" abstractionIngo Molnar
separate out the 'internal pipe object' abstraction, and make it usable to splice. This cleans up and fixes several aspects of the internal splice APIs and the pipe code: - pipes: the allocation and freeing of pipe_inode_info is now more symmetric and more streamlined with existing kernel practices. - splice: small micro-optimization: less pointer dereferencing in splice methods Signed-off-by: Ingo Molnar <mingo@elte.hu> Update XFS for the ->splice_read/->splice_write changes. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] splice: be smarter about calling do_page_cache_readahead()Jens Axboe
We don't want to call into the read-ahead logic unless we are at the start of a page, _or_ we have multiple pages to read. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] splice: optimize the splice buffer mappingJens Axboe
We don't really need to lock down the pages, just make sure they are uptodate. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] splice: cleanup __generic_file_splice_read()Jens Axboe
The whole shadow/pages logic got overly complex, and this simpler approach is actually faster in testing. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] splice: only call wake_up_interruptible() when we really have toJens Axboe
__wake_up_common() is pretty heavy in the kernel profiles, this brings it down to a more acceptable level. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] splice: potential !page dereferenceDave Jones
We can get to out: with a NULL page, which we probably don't want to be calling page_cache_release() on. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10[PATCH] splice: mark the io page as accessedJens Axboe
We should do that, since we do the LRU manipulation ourselves now. Suggested by Nick Piggin. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] splice: fix page stealing LRU handling. [PATCH] splice: page stealing needs to wait_on_page_writeback() [PATCH] splice: export generic_splice_sendpage [PATCH] splice: add a SPLICE_F_MORE flag [PATCH] splice: add comments documenting more of the code [PATCH] splice: improve writeback and clean up page stealing [PATCH] splice: fix shadow[] filling logic
2006-04-02[PATCH] splice: fix page stealing LRU handling.Jens Axboe
Originally from Nick Piggin, just adapted to the newer branch. You can't check PageLRU without holding zone->lru_lock. The page release code can get away with it only because the page refcount is 0 at that point. Also, you can't reliably remove pages from the LRU unless the refcount is 0. Ever. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02[PATCH] splice: page stealing needs to wait_on_page_writeback()Jens Axboe
Thanks to Andrew for the good explanation of why this is so. akpm writes: If a page is under writeback and we remove it from pagecache, it's still going to get written to disk. But the VFS no longer knows about that page, nor that this page is about to modify disk blocks. So there might be scenarios in which those blocks-which-are-about-to-be-written-to get reused for something else. When writeback completes, it'll scribble on those blocks. This won't happen in ext2/ext3-style filesystems in normal mode because the page has buffers and try_to_release_page() will fail. But ext2 in nobh mode doesn't attach buffers at all - it just sticks the page in a BIO, finds some new blocks, points the BIO at those blocks and lets it rip. While that write IO's in flight, someone could truncate the file. Truncate won't block on the writeout because the page isn't in pagecache any more. So truncate will the free the blocks from the file under the page's feet. Then something else can reallocate those blocks. Then write data to them. Now, the original write completes, corrupting the filesystem. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02[PATCH] splice: export generic_splice_sendpageJens Axboe
Forgot that one, thanks Jeff. Also move the other EXPORT_SYMBOL to right below the functions. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02[PATCH] splice: add a SPLICE_F_MORE flagJens Axboe
This lets userspace indicate whether more data will be coming in a subsequent splice call. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02[PATCH] splice: add comments documenting more of the codeJens Axboe
Hopefully this will make Andrew a little more happy. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02[PATCH] splice: improve writeback and clean up page stealingJens Axboe
By cleaning up the writeback logic (killing write_one_page() and the manual set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and just keep it local in pipe_to_file(). This also adds dirty page balancing logic and O_SYNC handling. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02[PATCH] splice: fix shadow[] filling logicJens Axboe
Clear the entire range, and don't increment pidx or we keep filling the same position again and again. Thanks to KAMEZAWA Hiroyuki. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02Merge git://oss.sgi.com:8090/oss/git/xfs-2.6Linus Torvalds
* git://oss.sgi.com:8090/oss/git/xfs-2.6: [XFS] Provide XFS support for the splice syscall. [XFS] Reenable write barriers by default. [XFS] Make project quota enforcement return an error code consistent with [XFS] Implement the silent parameter to fill_super, previously ignored. [XFS] Cleanup comment to remove reference to obsoleted function
2006-04-02[PATCH] sysfs: zero terminate sysfs write buffersGreg Kroah-Hartman
No one should be writing a PAGE_SIZE worth of data to a normal sysfs file, so properly terminate the buffer. Thanks to Al Viro for pointing out my supidity here. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits) Documentation: fix minor kernel-doc warnings BUG_ON() Conversion in drivers/net/ BUG_ON() Conversion in drivers/s390/net/lcs.c BUG_ON() Conversion in mm/slab.c BUG_ON() Conversion in mm/highmem.c BUG_ON() Conversion in kernel/signal.c BUG_ON() Conversion in kernel/signal.c BUG_ON() Conversion in kernel/ptrace.c BUG_ON() Conversion in ipc/shm.c BUG_ON() Conversion in fs/freevxfs/ BUG_ON() Conversion in fs/udf/ BUG_ON() Conversion in fs/sysv/ BUG_ON() Conversion in fs/inode.c BUG_ON() Conversion in fs/fcntl.c BUG_ON() Conversion in fs/dquot.c BUG_ON() Conversion in md/raid10.c BUG_ON() Conversion in md/raid6main.c BUG_ON() Conversion in md/raid5.c Fix minor documentation typo BFP->BPF in Documentation/networking/tuntap.txt ...
2006-04-02splice: add SPLICE_F_NONBLOCK flagLinus Torvalds
It doesn't make the splice itself necessarily nonblocking (because the actual file descriptors that are spliced from/to may block unless they have the O_NONBLOCK flag set), but it makes the splice pipe operations nonblocking. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02Documentation: fix minor kernel-doc warningsMartin Waitz
This patch updates the comments to match the actual code. Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02BUG_ON() Conversion in fs/freevxfs/Eric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02BUG_ON() Conversion in fs/udf/Eric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02BUG_ON() Conversion in fs/sysv/Eric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02BUG_ON() Conversion in fs/inode.cEric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can better optimized away Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02BUG_ON() Conversion in fs/fcntl.cEric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can better optimized away Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02BUG_ON() Conversion in fs/dquot.cEric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can better optimized away Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02Merge with git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.gitAdrian Bunk
2006-03-31Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Fix typo in earlier cifs_unlink change and protect one [CIFS] Incorrect signature sent on SMB Read [CIFS] Fix unlink oops when indirectly called in rename error path [CIFS] Fix two remaining coverity scan tool warnings. [CIFS] Set correct lock type on new posix unlock call [CIFS] Upate cifs change log [CIFS] Fix slow oplock break response when mounts to different [CIFS] Workaround various server bugs found in testing at connectathon [CIFS] Allow fallback for setting file size to Procom SMB server when [CIFS] Make POSIX CIFS Extensions SetFSInfo match exactly what we want [CIFS] Move noisy debug message (triggerred by some older servers) from [CIFS] Use correct pid on new cifs posix byte range lock call [CIFS] Add posix (advisory) byte range locking support to cifs client [CIFS] CIFS readdir perf optimizations part 1 [CIFS] Free small buffers earlier so we exceed the cifs [CIFS] Fix large (ie over 64K for MaxCIFSBufSize) buffer case for wrapping [CIFS] Convert remaining places in fs/cifs from [CIFS] SessionSetup cleanup part 2 [CIFS] fix compile error (typo) and warning in cifssmb.c [CIFS] Cleanup NTLMSSP session setup handling
2006-04-01BUG_ON() Conversion in fs/sysfs/Eric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01BUG_ON() Conversion in fs/smbfs/Eric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01BUG_ON() Conversion in fs/jffs2/Eric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01BUG_ON() Conversion in fs/hfsplus/Eric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner, contains unlikely() and can better optimized away. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01BUG_ON() Conversion in fs/exec.cEric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can better optimized away Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01BUG_ON() Conversion in fs/direct-io.cEric Sesterhenn
this changes if() BUG(); constructs to BUG_ON() which is cleaner and can better optimized away Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-31[CIFS] Fix typo in earlier cifs_unlink change and protect oneSteve French
extra path. Since cifs_unlink can also be called from rename path and there was one report of oops am making the extra check for null inode. Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31[CIFS] Incorrect signature sent on SMB ReadSteve French
Fixes Samba bug 3621 and kernel.org bug 6147 For servers which require SMB/CIFS packet signing, we were sending the wrong signature (all zeros) on SMB Read request. The new cifs routine to do signatures across an iovec was not complete - and SMB Read, unlike the new SMBWrite2, did not fall back to the older routine (ie use SendReceive vs. the more efficient SendReceive2 ie used the older cifs_sign_smb vs. the disabled cifs_sign_smb2) for calculating signatures. This finishes up cifs_sign_smb2/cifs_calc_signature2 so that the callers of SendReceive2 can get SMB/CIFS packet signatures. Now that cifs_sign_smb2 is supported, we could start using it in the write path but this smaller fix does not include the change to use SMBWrite2 when signatures are required (which when enabled will make more Writes more efficient and alloc less memory). Currently Write2 is only used when signatures are not required at the moment but after more testing we will enable that as well). Thanks to James Slepicka and Sam Flory for initial investigation. Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31[PATCH] avoid unaligned access when accessing poll stackJes Sorensen
Commit 70674f95c0a2ea694d5c39f4e514f538a09be36f: [PATCH] Optimize select/poll by putting small data sets on the stack resulted in the poll stack being 4-byte aligned on 64-bit architectures, causing misaligned accesses to elements in the array. This patch fixes it by declaring the stack in terms of 'long' instead of 'char'. Force alignment of poll and select stacks to long to avoid unaligned access on 64 bit architectures. Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] fs/namei.c: make lookup_hash() staticAdrian Bunk
As announced, lookup_hash() can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] dcache: Add helper d_hash_and_lookupEric W. Biederman
It is very common to hash a dentry and then to call lookup. If we take fs specific hash functions into account the full hash logic can get ugly. Further full_name_hash as an inline function is almost 100 bytes on x86 so having a non-inline choice in some cases can measurably decrease code size. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] cleanup in proc_check_chroot()Herbert Poetzl
proc_check_chroot() does the check in a very unintuitive way (keeping a copy of the argument, then modifying the argument), and has uncommented sideeffects. Signed-off-by: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] fs/locks.c: Fix sys_flock() raceTrond Myklebust
sys_flock() currently has a race which can result in a double free in the multi-thread case. Thread 1 Thread 2 sys_flock(file, LOCK_EX) sys_flock(file, LOCK_UN) If Thread 2 removes the lock from inode->i_lock before Thread 1 tests for list_empty(&lock->fl_link) at the end of sys_flock, then both threads will end up calling locks_free_lock for the same lock. Fix is to make flock_lock_file() do the same as posix_lock_file(), namely to make a copy of the request, so that the caller can always free the lock. This also has the side-effect of fixing up a reference problem in the lockd handling of flock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] inotify: IN_DELETE events missingAmy Griffis
IN_DELETE events are no longer generated for the removal of a file from a watched directory. This seems to be a result of clearing DCACHE_INOTIFY_PARENT_WATCHED in d_delete() directly before calling fsnotify_nameremove(). Assuming the flag doesn't need to be cleared before dentry_iput(), this should do the trick. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Acked-by: Robert Love <rml@novell.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] fat: kill reserved namesOGAWA Hirofumi
Since these names on old MSDOS is used as device, so, current fat driver doesn't allow a user to create those names. But many OSes and even Windows can create those names actually, now. This patch removes the reserved name check. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] sys_sync_file_range()Andrew Morton
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT fadvise() additions, do it in a new sys_sync_file_range() syscall instead. Reasons: - It's more flexible. Things which would require two or three syscalls with fadvise() can be done in a single syscall. - Using fadvise() in this manner is something not covered by POSIX. The patch wires up the syscall for x86. The sycall is implemented in the new fs/sync.c. The intention is that we can move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later. Documentation for the syscall is in fs/sync.c. A test app (sync_file_range.c) is in http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz. The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can say NFS_DATA_SYNC or NFS_FILE_SYNC. I can skip the ->fsync call for NFS_DATA_SYNC which is hopefully the more common." Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if the queue is congested. This is trivial to fix: add a new flag bit, set wbc->nonblocking. But I'm not sure that we want to expose implementation details down to that level. Note: it's notable that we can sync an fd which wasn't opened for writing. Same with fsync() and fdatasync()). Note: the code takes some care to handle attempts to sync file contents outside the 16TB offset on 32-bit machines. It makes such attempts appear to succeed, for best 32-bit/64-bit compatibility. Perhaps it should make such requests fail... Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>