Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2006-10-01	[PATCH] r/o bind mount prepwork: inc_nlink() helper	Dave Hansen
	This is mostly included for parity with dec_nlink(), where we will have some more hooks. This one should stay pretty darn straightforward for now. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] r/o bind mounts: unlink: monitor i_nlink	Dave Hansen
	When a filesystem decrements i_nlink to zero, it means that a write must be performed in order to drop the inode from the filesystem. We're shortly going to have keep filesystems from being remounted r/o between the time that this i_nlink decrement and that write occurs. So, add a little helper function to do the decrements. We'll tie into it in a bit to note when i_nlink hits zero. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] r/o bind mount prepwork: move open_namei()'s vfs_create()	Dave Hansen
	The code around vfs_create() in open_namei() is getting a bit too complex. Right now, there is at least the reference count on the dentry, and the i_mutex to worry about. Soon, we'll also have mnt_writecount. So, break the vfs_create() call out of open_namei(), and into a helper function. This duplicates the call to may_open(), but that isn't such a bad thing since the arguments (acc_mode and flag) were being heavily massaged anyway. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] r/o bind mounts: prepare for write access checks: collapse if()	Dave Hansen
	We're shortly going to be adding a bunch more permission checks in these functions. That requires adding either a bunch of new if() conditions, or some gotos. This patch collapses existing if()s and uses gotos instead to prepare for the upcoming changes. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] csa: convert CONFIG tag for extended accounting routines	Jay Lan
	There were a few accounting data/macros that are used in CSA but are #ifdef'ed inside CONFIG_BSD_PROCESS_ACCT. This patch is to change those ifdef's from CONFIG_BSD_PROCESS_ACCT to CONFIG_TASK_XACCT. A few defines are moved from kernel/acct.c and include/linux/acct.h to kernel/tsacct.c and include/linux/tsacct_kern.h. Signed-off-by: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] Add vector AIO support	Badari Pulavarty
	This work is initially done by Zach Brown to add support for vectored aio. These are the core changes for AIO to support IOCB_CMD_PREADV/IOCB_CMD_PWRITEV. [akpm@osdl.org: huge build fix] Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] Streamline generic_file_* interfaces and filemap cleanups	Badari Pulavarty
	This patch cleans up generic_file__read/write() interfaces. Christoph Hellwig gave me the idea for this clean ups. In a nutshell, all filesystems should set .aio_read/.aio_write methods and use do_sync_read/ do_sync_write() as their .read/.write methods. This allows us to cleanup all variants of generic_file_ routines. Final available interfaces: generic_file_aio_read() - read handler generic_file_aio_write() - write handler generic_file_aio_write_nolock() - no lock write handler __generic_file_aio_write_nolock() - internal worker routine Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] Remove readv/writev methods and use aio_read/aio_write instead	Badari Pulavarty
	This patch removes readv() and writev() methods and replaces them with aio_read()/aio_write() methods. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] Vectorize aio_read/aio_write fileop methods	Badari Pulavarty
	This patch vectorizes aio_read() and aio_write() methods to prepare for collapsing all aio & vectored operations into one interface - which is aio_read()/aio_write(). Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Michael Holzheu <HOLZHEU@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] reiserfs: eliminate minimum window size for bitmap searching	Jeff Mahoney
	When a file system becomes fragmented (using MythTV, for example), the bigalloc window searching ends up causing huge performance problems. In a file system presented by a user experiencing this bug, the file system was 90% free, but no 32-block free windows existed on the entire file system. This causes the allocator to scan the entire file system for each 128k write before backing down to searching for individual blocks. In the end, finding a contiguous window for all the blocks in a write is an advantageous special case, but one that can be found naturally when such a window exists anyway. This patch removes the bigalloc window searching, and has been proven to fix the test case described above. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] reiserfs: use generic_file_open for open() checks	Jeff Mahoney
	The other common disk-based file systems (I checked ext[23], xfs, jfs) check to ensure that opens of files > 2 GB fail unless O_LARGEFILE is specified. They check via generic_file_open or their own open routine. ReiserFS doesn't have an f_op->open defined, and as such, it's possible to open files > 2 GB without O_LARGEFILE. This patch adds the f_op->open member to conform with the expected behavior. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] reiserfs: on-demand bitmap loading	Jeff Mahoney
	This is the patch the three previous ones have been leading up to. It changes the behavior of ReiserFS from loading and caching all the bitmaps as special, to treating the bitmaps like any other bit of metadata and just letting the system-wide caches figure out what to hang on to. Buffer heads are allocated on the fly, so there is no need to retain pointers to all of them. The caching of the metadata occurs when the data is read and updated, and is considered invalid and uncached until then. I needed to remove the vs-4040 check for performing a duplicate operation on a particular bit. The reason is that while the other sites for working with bitmaps are allowed to schedule, is_reusable() is called from do_balance(), which will panic if a schedule occurs in certain places. The benefit of on-demand bitmaps clearly outweighs a sanity check that depends on a compile-time option that is discouraged. [akpm@osdl.org: warning fix] Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] reiserfs: reorganize bitmap loading functions	Jeff Mahoney
	This patch moves the bitmap loading code from super.c to bitmap.c The code is also restructured somewhat. The only difference between new format bitmaps and old format bitmaps is where they are. That's a two liner before loading the block to use the correct one. There's no need for an entirely separate code path. The load path is generally the same, with the pattern being to throw out a bunch of requests and then wait for them, then cache the metadata from the contents. Again, like the previous patches, the purpose is to set up for later ones. Update: There was a bug in the previously posted version of this that resulted in corruption. The problem was that bitmap 0 on new format file systems must be treated specially, and wasn't. A stupid bug with an easy fix. This is hopefully the last fix for the disaster that is the reiserfs bitmap patch set. If a bitmap block was full, first_zero_hint would end up at zero since it would never be changed from it's zeroed out value. This just sets it beyond the end of the bitmap block. If any bits are freed, it will be reset to a valid bit. When info->free_count = 0, then we already know it's full. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] reiserfs: clean up bitmap block buffer head references	Jeff Mahoney
	Similar to the SB_JOURNAL cleanup that was accepted a while ago, this patch uses a temporary variable for buffer head references from the bitmap info array. This makes the code much more readable in some areas. It also uses proper reference counting, doing a get_bh() after using the pointer from the array and brelse()'ing it later. This may seem silly, but a later patch will replace the simple temporary variables with an actual read, so the reference freeing will be used then. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] reiserfs: fix is_reusable bitmap check to not traverse the bitmap ↵	Jeff Mahoney
	info array There is a check in is_reusable to determine if a particular block is a bitmap block. It verifies this by going through the array of bitmap block buffer heads and comparing the block number to each one. Bitmap blocks are at defined locations on the disk in both old and current formats. Simply checking against the known good values is enough. This is a trivial optimization for a non-production codepath, but this is the first in a series of patches that will ultimately remove the buffer heads from that array. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] Move ncpfs 32bit compat ioctl to ncpfs	Petr Vandrovec
	The ncp specific compat ioctls are clearly local to one file system, so the code can better live there. This version of the patch moves everything into the generic ioctl handler and uses it for both 32 and 64 bit calls. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] VFS: Use SEEK_{SET, CUR, END} instead of hardcoded values	Josef 'Jeff' Sipek
	VFS: Use SEEK_{SET,CUR,END} instead of hardcoded values Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] Create fs/utimes.c	Alexey Dobriyan
	* fs/open.c is getting bit crowdy * preparation to lutimes(2) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] kmemdup: some users	Alexey Dobriyan
	Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] fs/partitions: Conversion to generic boolean	Richard Knutsson
	Conversion of booleans to: generic-boolean.patch (2006-08-23) Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] fs/jfs: Conversion to generic boolean	Richard Knutsson
	Conversion of booleans to: generic-boolean.patch (2006-08-23) Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01	[PATCH] fs/ntfs: Conversion to generic boolean	Richard Knutsson
	Conversion of booleans to: generic-boolean.patch (2006-08-23) Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30	[PATCH] Update axboe@suse.de email address	Jens Axboe
	As people often look for the copyright in files to see who to mail, update the link to a neutral one. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] fix creating zero sized bio mempools in low memory system	Milan Broz
	In the very low memory systems is in the init_bio call scale parameter set to zero and it leads to creating zero sized mempool. This patch prevents pool_entries parameter become zero, so the created pool have at least 1 entry. Mempool with 0 entries lead to incorrect behaviour of mempool_free. (Alloc requests are not waken up and system stalls in mempool_alloc->ioschedule). Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] CONFIG_BLOCK internal.h cleanups	Andrew Morton
	- forward declare struct superblock - use inlines, not macros Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Make it possible to disable the block layer [try #6]	David Howells
	Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present. This patch does the following: () Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support. () Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes: () Block I/O tracing. () Disk partition code. () All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS. () The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this. () Various block-based device drivers, such as IDE and the old CDROM drivers. () MTD blockdev handling and FTL. () JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book. () Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available. () Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h. () Makes a number of files in fs/ contingent on CONFIG_BLOCK. () Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK. () set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled. () fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set: () Default blockdev file operations (to give error ENODEV on opening). () Makes some /proc changes: () /proc/devices does not list any blockdevs. () /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK. () Makes some compat ioctl handling contingent on CONFIG_BLOCK. () If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified. () In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2. () The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so). () The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Remove no-longer necessary linux/buffer_head.h inclusions ↵	David Howells
	[try #6] Remove inclusions of linux/buffer_head.h that are no longer necessary due to the transfer of a number of things out of there. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Remove no-longer necessary linux/mpage.h inclusions [try #6]	David Howells
	Remove inclusions of linux/mpage.h that are no longer necessary due to the transfer of generic_writepages(). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move the msdos device ioctl compat stuff to the msdos driver ↵	David Howells
	[try #6] Move the msdos device ioctl compat stuff from fs/compat_ioctl.c to the msdos driver so that the msdos header file doesn't need to be included. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move the Ext3 device ioctl compat stuff to the Ext3 driver ↵	David Howells
	[try #6] Move the Ext3 device ioctl compat stuff from fs/compat_ioctl.c to the Ext3 driver so that the Ext3 header file doesn't need to be included. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move the Ext2 device ioctl compat stuff to the Ext2 driver ↵	David Howells
	[try #6] Move the Ext2 device ioctl compat stuff from fs/compat_ioctl.c to the Ext2 driver so that the Ext2 header file doesn't need to be included. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move the ReiserFS device ioctl compat stuff to the ReiserFS ↵	David Howells
	driver [try #6] Move the ReiserFS device ioctl compat stuff from fs/compat_ioctl.c to the ReiserFS driver so that the ReiserFS header file doesn't need to be included. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #6]	David Howells
	Move common FS-specific ioctls from linux/ext2_fs.h to linux/fs.h as FS_IOC_* and FS_IOC32_* and have the users of them use those as a base. Also move the GETFLAGS/SETFLAGS flags to linux/fs.h as FS_*_FL macros, and then have the other users use them as a base. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move the loop device ioctl compat stuff to the loop driver ↵	David Howells
	[try #6] Move the loop device ioctl compat stuff from fs/compat_ioctl.c to the loop driver so that the loop header file doesn't need to be included. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move __invalidate_device() to block_dev.c [try #6]	David Howells
	Move __invalidate_device() from fs/inode.c to fs/block_dev.c so that it can more easily be disabled when the block layer is disabled. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Dissociate generic_writepages() from mpage stuff [try #6]	David Howells
	Dissociate the generic_writepages() function from the mpage stuff, moving its declaration to linux/mm.h and actually emitting a full implementation into mm/page-writeback.c. The implementation is a partial duplicate of mpage_writepages() with all BIO references removed. It is used by NFS to do writeback. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Remove dependence on existence of blockdev_superblock [try #6]	David Howells
	Move blockdev_superblock extern declaration from fs/fs-writeback.c to a headerfile and remove the dependence on it by wrapping it in a macro. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move extern declarations out of fs/*.c into header files [try #6]	David Howells
	Create a new header file, fs/internal.h, for common definitions local to the sources in the fs/ directory. Move extern definitions that should be in header files from fs/*.c to fs/internal.h or other main header files where they span directories. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Don't call block_sync_page() from AFS [try #6]	David Howells
	The AFS filesystem no longer needs to override its sync_page() op. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] BLOCK: Move functions out of buffer code [try #6]	David Howells
	Move some functions out of the buffering code that aren't strictly buffering specific. This is a precursor to being able to disable the block layer. () Moved some stuff out of fs/buffer.c: () The file sync and general sync stuff moved to fs/sync.c. () The superblock sync stuff moved to fs/super.c. () do_invalidatepage() moved to mm/truncate.c. () try_to_release_page() moved to mm/filemap.c. () Moved some related declarations between header files: () declarations for do_invalidatepage() and try_to_release_page() moved to linux/mm.h. () __set_page_dirty_buffers() moved to linux/buffer_head.h. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] Don't need to disable interrupts for tasklist_lock	Oleg Nesterov
	Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30	[PATCH] ext3: make meta data reads use READ_META	Jens Axboe
	Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-30	[PATCH] cfq-iosched: kill cfq_exit_lock	Jens Axboe
	cfq_exit_lock is protecting two things now: - The per-ioc rbtree of cfq_io_contexts - The per-cfqd linked list of cfq_io_contexts The per-cfqd linked list can be protected by the queue lock, as it is (by definition) per cfqd as the queue lock is. The per-ioc rbtree is mainly used and updated by the process itself only. The only outside use is the io priority changing. If we move the priority changing to not browsing the rbtree, we can remove any locking from the rbtree updates and lookup completely. Let the sys_ioprio syscall just mark processes as having the iopriority changed and lazily update the private cfq io contexts the next time io is queued, and we can remove this locking as well. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-29	Merge git://oss.sgi.com:8090/xfs/xfs-2.6	Linus Torvalds
	* git://oss.sgi.com:8090/xfs/xfs-2.6: (49 commits) [XFS] Remove v1 dir trace macro - missed in a past commit. [XFS] 955947: Infinite loop in xfs_bulkstat() on formatter() error [XFS] pv 956241, author: nathans, rv: vapo - make ino validation checks [XFS] pv 956240, author: nathans, rv: vapo - Minor fixes in [XFS] Really fix use after free in xfs_iunpin. [XFS] Collapse sv_init and init_sv into just the one interface. [XFS] standardize on one sema init macro [XFS] Reduce endian flipping in alloc_btree, same as was done for [XFS] Minor cleanup from dio locking fix, remove an extra conditional. [XFS] Fix kmem_zalloc_greedy warnings on 64 bit platforms. [XFS] pv 955157, rv bnaujok - break the loop on EFAULT formatter() error [XFS] pv 955157, rv bnaujok - break the loop on formatter() error [XFS] Fixes the leak in reservation space because we weren't ungranting [XFS] Add lock annotations to xfs_trans_update_ail and [XFS] Fix a porting botch on the realtime subvol growfs code path. [XFS] Minor code rearranging and cleanup to prevent some coverity false [XFS] Remove a no-longer-correct debug assert from dio completion [XFS] Add a greedy allocation interface, allocating within a min/max size [XFS] Improve error handling for the zero-fsblock extent detection code. [XFS] Be more defensive with page flags (error/private) for metadata ...
2006-09-29	[PATCH] Kcore elf note namesz field fix	Vivek Goyal
	o As per ELF specifications, it looks like that elf note "namesz" field contains the length of "name" including the size of null character. And currently we are filling "namesz" without taking into the consideration the null character size. o Kexec-tools performs this check deligently hence I ran into the issue while trying to open /proc/kcore in kexec-tools for some info. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29	[PATCH] expand_fdtable(): remove pointless unlock+lock	Andrew Morton
	This unlock/lock on a super-unlikely path isn't worth the kernel text. Cc: Vadim Lobanov <vlobanov@speakeasy.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29	[PATCH] Clean up expand_fdtable() and expand_files()	Vadim Lobanov
	Perform a code cleanup against the expand_fdtable() and expand_files() functions inside fs/file.c. It aims to make the flow of code within these functions simpler and easier to understand, via added comments and modest refactoring. Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29	[PATCH] Access Control Lists for tmpfs	Andreas Gruenbacher
	Add access control lists for tmpfs. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29	[PATCH] Generic infrastructure for acls	Andreas Gruenbacher
	The patches solve the following problem: We want to grant access to devices based on who is logged in from where, etc. This includes switching back and forth between multiple user sessions, etc. Using ACLs to define device access for logged-in users gives us all the flexibility we need in order to fully solve the problem. Device special files nowadays usually live on tmpfs, hence tmpfs ACLs. Different distros have come up with solutions that solve the problem to different degrees: SUSE uses a resource manager which tracks login sessions and sets ACLs on device inodes as appropriate. RedHat uses pam_console, which changes the primary file ownership to the logged-in user. Others use a set of groups that users must be in in order to be granted the appropriate accesses. The freedesktop.org project plans to implement a combination of a console-tracker and a HAL-device-list based solution to grant access to devices to users, and more distros will likely follow this approach. These patches have first been posted here on 2 February 2005, and again on 8 January 2006. We have been shipping them in SLES9 and SLES10 with no problems reported. The previous submission is archived here: http://lkml.org/lkml/2006/1/8/229 http://lkml.org/lkml/2006/1/8/230 http://lkml.org/lkml/2006/1/8/231 This patch: Add some infrastructure for access control lists on in-memory filesystems such as tmpfs. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29	[PATCH] enforce RLIMIT_NOFILE in poll()	Chris Snook
	POSIX states that poll() shall fail with EINVAL if nfds > OPEN_MAX. In this context, POSIX is referring to sysconf(OPEN_MAX), which is the value of current->signal->rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not the compile-time constant which happens to also be named OPEN_MAX. In the current code, an application may poll up to max_fdset file descriptors, even if this exceeds RLIMIT_NOFILE. The current code also breaks applications which poll more than max_fdset descriptors, which worked circa 2.4.18 when the check was against NR_OPEN, which is 1024*1024. This patch enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has been changed at run time with ulimit -n. To elaborate on the rationale for this, there are three cases: 1) RLIMIT_NOFILE is at the default value of 1024 In this (default) case, the patch changes nothing. Calls with nfds > 1024 fail with EINVAL both before and after the patch, and calls with nfds <= 1024 pass the check both before and after the patch, since 1024 is the initial value of max_fdset. 2) RLIMIT_NOFILE has been raised above the default In this case, poll() becomes more permissive, allowing polling up to RLIMIT_NOFILE file descriptors even if less than 1024 have been opened. The patch won't introduce new errors here. If an application somehow depends on poll() failing when it polls with duplicate or invalid file descriptors, it's already broken, since this is already allowed below 1024, and will also work above 1024 if enough file descriptors have been open at some point to cause max_fdset to have been increased above nfds. 3) RLIMIT_NOFILE has been lowered below the default In this case, the system administrator or the user has gone out of their way to protect the system from inefficient (or malicious) applications wasting kernel memory. The current code allows polling up to 1024 file descriptors even if RLIMIT_NOFILE is much lower, which is not what the user or administrator intended. Well-written applications which only poll valid, unique file descriptors will never notice the difference, because they'll hit the limit on open() first. If an application gets broken because of the patch in this case, then it was already poorly/maliciously designed, and allowing it to work in the past was a violation of POSIX and a DoS risk on low-resource systems. With this patch, poll() will permit exactly what POSIX suggests, no more, no less, and for any run-time value set with ulimit -n, not just 256 or 1024. There are existing apps which which poll a large number of file descriptors, some of which may be invalid, and if those numbers stradle 1024, they currently fail with or without the patch in -mm, though they worked fine under 2.4.18. Signed-off-by: Chris Snook <csnook@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>