aboutsummaryrefslogtreecommitdiff
path: root/fs/xfs
AgeCommit message (Collapse)Author
2009-09-24Merge branch 'hwpoison' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits) HWPOISON: Enable error_remove_page on btrfs HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs HWPOISON: Add madvise() based injector for hardware poisoned pages v4 HWPOISON: Enable error_remove_page for NFS HWPOISON: Enable .remove_error_page for migration aware file systems HWPOISON: The high level memory error handler in the VM v7 HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process HWPOISON: shmem: call set_page_dirty() with locked page HWPOISON: Define a new error_remove_page address space op for async truncation HWPOISON: Add invalidate_inode_page HWPOISON: Refactor truncate to allow direct truncating of page v2 HWPOISON: check and isolate corrupted free pages v2 HWPOISON: Handle hardware poisoned pages in try_to_unmap HWPOISON: Use bitmask/action code for try_to_unmap behaviour HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2 HWPOISON: Add poison check to page fault handling HWPOISON: Add basic support for poisoned pages in fault handler v3 HWPOISON: Add new SIGBUS error codes for hardware poison signals HWPOISON: Add support for poison swap entries v2 HWPOISON: Export some rmap vma locking to outside world ...
2009-09-24sysctl: remove "struct file *" argument of ->proc_handlerAlexey Dobriyan
It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) trivial: fix typo in aic7xxx comment trivial: fix comment typo in drivers/ata/pata_hpt37x.c trivial: typo in kernel-parameters.txt trivial: fix typo in tracing documentation trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c trivial: remove unnecessary semicolons trivial: Fix duplicated word "options" in comment trivial: kbuild: remove extraneous blank line after declaration of usage() trivial: improve help text for mm debug config options trivial: doc: hpfall: accept disk device to unload as argument trivial: doc: hpfall: reduce risk that hpfall can do harm trivial: SubmittingPatches: Fix reference to renumbered step trivial: fix typos "man[ae]g?ment" -> "management" trivial: media/video/cx88: add __init/__exit macros to cx88 drivers trivial: fix typo in CONFIG_DEBUG_FS in gcov doc trivial: fix missing printk space in amd_k7_smp_check trivial: fix typo s/ketymap/keymap/ in comment trivial: fix typo "to to" in multiple files trivial: fix typos in comments s/DGBU/DBGU/ ...
2009-09-22const: mark remaining super_operations constAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22const: make struct super_block::s_qcop constAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-21trivial: fix typo "for for" in multiple filesAnand Gadiyar
trivial: fix typo "for for" in multiple files Signed-off-by: Anand Gadiyar <gadiyar@ti.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-16HWPOISON: Enable .remove_error_page for migration aware file systemsAndi Kleen
Enable removing of corrupted pages through truncation for a bunch of file systems: ext*, xfs, gfs2, ocfs2, ntfs These should cover most server needs. I chose the set of migration aware file systems for this for now, assuming they have been especially audited. But in general it should be safe for all file systems on the data area that support read/write and truncate. Caveat: the hardware error handler does not take i_mutex for now before calling the truncate function. Is that ok? Cc: tytso@mit.edu Cc: hch@infradead.org Cc: mfasheh@suse.com Cc: aia21@cantab.net Cc: hugh.dickins@tiscali.co.uk Cc: swhiteho@redhat.com Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-15Merge branch 'master' of git://oss.sgi.com/xfs/xfs into for-linusAlex Elder
Conflicts: fs/xfs/linux-2.6/xfs_lrw.c
2009-09-15xfs: includecheck fix for fs/xfs/xfs_iops.cJaswinder Singh Rajput
fix the following 'make includecheck' warning: fs/xfs/linux-2.6/xfs_iops.c: xfs_acl.h is included more than once. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2009-09-15xfs: switch to seq_fileAlexey Dobriyan
create_proc_read_entry() is getting deprecated. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2009-09-14xfs: Convert sync_page_range() to simple filemap_write_and_wait_range()Jan Kara
Christoph Hellwig says that it is enough for XFS to call filemap_write_and_wait_range() instead of sync_page_range() because we do all the metadata syncing when forcing the log. CC: Felix Blyakher <felixb@sgi.com> CC: xfs@oss.sgi.com CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-09xfs: use correct log reservation when handling ENOSPC in xfs_createChristoph Hellwig
We added the ENOSPC handling patch in xfs_create just after it got mered with xfs_mkdir. Change the log reservation to the variable for either the create or mkdir value so it does the right thing if get here for creating a directory. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2009-09-08jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()'Linus Torvalds
This avoids an indirect call in the VFS for each path component lookup. Well, at least as long as you own the directory in question, and the ACL check is unnecessary. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-02xfs: xfs_showargs() reports group *and* project quotas enabledAlex Elder
If you enable group or project quotas on an XFS file system, then the mount table presented through /proc/self/mounts erroneously shows that both options are in effect for the file system. The root of the problem is some bad logic in the xfs_showargs() function, which is used to format the file system type-specific options in effect for a file system. The problem originated in this GIT commit: Move platform specific mount option parse out of core XFS code Date: 11/22/07 Author: Dave Chinner SHA1 ID: a67d7c5f5d25d0b13a4dfb182697135b014fa478 For XFS quotas, project and group quota management are mutually exclusive--only one can be in effect at a time. There are two parts to managing quotas: aggregating usage information; and enforcing limits. It is possible to have a quota in effect (aggregating usage) but not enforced. These features are recorded on an XFS mount point using these flags: XFS_PQUOTA_ACCT - Project quotas are aggregated XFS_GQUOTA_ACCT - Group quotas are aggregated XFS_OQUOTA_ENFD - Project/group quotas are enforced The code in error is in fs/xfs/linux-2.6/xfs_super.c: if (mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_OQUOTA_ENFD)) seq_puts(m, "," MNTOPT_PRJQUOTA); else if (mp->m_qflags & XFS_PQUOTA_ACCT) seq_puts(m, "," MNTOPT_PQUOTANOENF); if (mp->m_qflags & (XFS_GQUOTA_ACCT|XFS_OQUOTA_ENFD)) seq_puts(m, "," MNTOPT_GRPQUOTA); else if (mp->m_qflags & XFS_GQUOTA_ACCT) seq_puts(m, "," MNTOPT_GQUOTANOENF); The problem is that XFS_OQUOTA_ENFD will be set in mp->m_qflags if either group or project quotas are enforced, and as a result both MNTOPT_PRJQUOTA and MNTOPT_GRPQUOTA will be shown as mount options. Signed-off-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: un-static xfs_inobt_lookupChristoph Hellwig
xfs_inobt_lookup is also used in xfs_itable.c, remove the STATIC modifier from it's declaration to fix non-debug builds. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: actually enable the swapext compat handlerChristoph Hellwig
Fix a small typo in the compat ioctl handler that cause the swapext compat handler to never be called. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Torsten Kaiser <just.for.lkml@googlemail.com> Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: actually enable the swapext compat handlerChristoph Hellwig
Fix a small typo in the compat ioctl handler that cause the swapext compat handler to never be called. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Torsten Kaiser <just.for.lkml@googlemail.com> Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: simplify xfs_trans_igetChristoph Hellwig
xfs_trans_iget is a wrapper for xfs_iget that adds the inode to the transaction after it is read. Except when the inode already is in the inode cache, in which case it returns the existing locked inode with increment lock recursion counts. Now, no one in the tree every decrements these lock recursion counts, so any user of this gets a potential double unlock when both the original owner of the inode and the xfs_trans_iget caller unlock it. When looking back in a git bisect in the historic XFS tree there was only one place that decremented these counts, xfs_trans_iput. Introduced in commit ca25df7a840f426eb566d52667b6950b92bb84b5 by Adam Sweeney in 1993, and removed in commit 19f899a3ab155ff6a49c0c79b06f2f61059afaf3 by Steve Lord in 2003. And as long as it didn't slip through git bisects cracks never actually used in that time frame. A quick audit of the callers of xfs_trans_iget shows that no caller really relies on this behaviour fortunately - xfs_ialloc allows this inode from disk so it must not be there before, and all the RT allocator routines only every add each RT bitmap inode once. In addition to removing lots of code and reducing the size of the inode item this patch also avoids the double inode cache lookup in each create/mkdir/mknod transaction. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: merge fsync and O_SYNC handlingChristoph Hellwig
The guarantees for O_SYNC are exactly the same as the ones we need to make for an fsync call (and given that Linux O_SYNC is O_DSYNC the equivalent is fdadatasync, but we treat both the same in XFS), except with a range data writeout. Jan Kara has started unifying these two path for filesystems using the generic helpers, and I've started to look at XFS. The actual transaction commited by xfs_fsync and xfs_write_sync_logforce has a different transaction number, but actually is exactly the same. We'll only use the fsync transaction going forward. One major difference is that xfs_write_sync_logforce never issues a cache flush unless we commit a transaction causing that as a side-effect, which is an obvious bug in the O_SYNC handling. Second all the locking and i_update_size vs i_update_core changes from 978b7237123d007b9fa983af6e0e2fa8f97f9934 never made it to xfs_write_sync_logforce, so we add them back. To make xfs_fsync easily usable from the O_SYNC path, the filemap_fdatawait call is moved up to xfs_file_fsync, so that we don't wait on the whole file after we already waited for our portion in xfs_write. We'll also use a plain call to filemap_write_and_wait_range instead of the previous sync_page_rang which did it in two steps including an half-hearted inode write out that doesn't help us. Once we're done with this also remove the now useless i_update_size tracking. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: speed up free inode searchDave Chinner
Don't search too far - abort if it is outside a certain radius and simply do a linear search for the first free inode. In AGs with a million inodes this can speed up allocation speed by 3-4x. [hch: ported to the new xfs_ialloc.c world order] Signed-off-by: Dave Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: rationalize xfs_inobt_lookup*Christoph Hellwig
Currenly we have a xfs_inobt_lookup* variant for each comparism direction, and all these get all three fields of the inobt records passed, while the common case is just looking for the inode number and we have only marginally more callers than xfs_inobt_lookup* variants. So opencode a direct call to xfs_btree_lookup for the single case where we need all fields, and replace xfs_inobt_lookup* with a xfs_inobt_looku that just takes the inode number and the direction for all other callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: untangle xfs_diallocChristoph Hellwig
Clarify the control flow in xfs_dialloc. Factor out a helper to go to the next node from the current one and improve the control flow by expanding composite if statements and using gotos. The xfs_ialloc_next_rec helper is borrowed from Dave Chinners dynamic allocation policy patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: factor out debug checks from xfs_dialloc and xfs_difreeDave Chinner
Factor out a common helper from repeated debug checks in xfs_dialloc and xfs_difree. [hch: split out from Dave's dynamic allocation policy patches] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: improve xfs_inobt_update prototypeChristoph Hellwig
Both callers of xfs_inobt_update have the record in form of a xfs_inobt_rec_incore_t, so just pass a pointer to it instead of the individual variables. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: improve xfs_inobt_get_rec prototypeChristoph Hellwig
Most callers of xfs_inobt_get_rec need to fill a xfs_inobt_rec_incore_t, and those who don't yet are fine with a xfs_inobt_rec_incore_t, instead of the three individual variables, too. So just change xfs_inobt_get_rec to write the output into a xfs_inobt_rec_incore_t directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01xfs: factor out inode initialisationDave Chinner
Factor out code to initialize new inode clusters into a function of it's own. This keeps xfs_ialloc_ag_alloc smaller and better structured and enables a future inode cluster initialization transaction. Also initialize the agno variable earlier in xfs_ialloc_ag_alloc to avoid repeated byte swaps. [hch: The original patch is from Dave from his unpublished inode create transaction patch series, with some modifcations by me to apply stand-alone] Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-31fs/xfs: Correct redundant testJulia Lawall
bp was tested for NULL a few lines before, followed by a return, and there is no intervening modification of its value. A simplified version of the semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r exists@ local idexpression x; expression E; position p1,p2; @@ if (x == NULL || ...) { ... when forall return ...; } ... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\) ( *x == NULL | *x != NULL ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-31xfs: remove XFS_INO64_OFFSETEric Sandeen
Commit a19d9f887d81106d52cacbc9930207b487e07e0e removed the ino64 option but left the XFS_INO64_OFFSET define it used in place - just remove it. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-31un-static xfs_read_agfEric Sandeen
CONFIG_XFS_DEBUG builds still need xfs_read_agf to be non-static, oops. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-31xfs: add more statics & drop some unused functionsEric Sandeen
A lot more functions could be made static, but they need forward declarations; this does some easy ones, and also found a few unused functions in the process. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-17xfs: fix locking in xfs_iget_cache_hitChristoph Hellwig
The locking in xfs_iget_cache_hit currently has numerous problems: - we clear the reclaim tag without i_flags_lock which protects modifications to it - we call inode_init_always which can sleep with pag_ici_lock held (this is oss.sgi.com BZ #819) - we acquire and drop i_flags_lock a lot and thus provide no consistency between the various flags we set/clear under it This patch fixes all that with a major revamp of the locking in the function. The new version acquires i_flags_lock early and only drops it once we need to call into inode_init_always or before calling xfs_ilock. This patch fixes a bug seen in the wild where we race modifying the reclaim tag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix spin_is_locked assert on uni-processor builds xfs: check for dinode realtime flag corruption use XFS_CORRUPTION_ERROR in xfs_btree_check_sblock xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_get xfs: switch to NOFS allocation under i_lock in xfs_readlink_bmap xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_set xfs: switch to NOFS allocation under i_lock in xfs_buf_associate_memory xfs: switch to NOFS allocation under i_lock in xfs_dir_cilookup_result xfs: switch to NOFS allocation under i_lock in xfs_da_buf_make xfs: switch to NOFS allocation under i_lock in xfs_da_state_alloc xfs: switch to NOFS allocation under i_lock in xfs_getbmap xfs: avoid memory allocation under m_peraglock in growfs code
2009-08-12xfs: fix spin_is_locked assert on uni-processor buildsChristoph Hellwig
Without SMP or preemption spin_is_locked always returns false, so we can't do an assert with it. Instead use assert_spin_locked, which does the right thing on all builds. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reported-by: Johannes Engel <jcnengel@googlemail.com> Tested-by: Johannes Engel <jcnengel@googlemail.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: check for dinode realtime flag corruptionChristoph Hellwig
Ramon tested XFS with a modified version of fsfuzzer and hit a NULL pointer dereference in __xfs_get_blocks due to the RT device target pointer being NULL. To fix this reject inode with the realtime bit set on a a filesystem without an RT subvolume during inode read. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reported-by: Ramon de Carvalho Valle <ramon@risesecurity.org> Tested-by: Ramon de Carvalho Valle <ramon@risesecurity.org> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12use XFS_CORRUPTION_ERROR in xfs_btree_check_sblockEric Sandeen
In Red Hat Bug 512552 - Can't write to XFS mount during raid5 resync a user ran into corruption while resyncing a raid, and we failed a consistency test, but didn't get much more info; it'd be nice to call XFS_CORRUPTION_ERROR here so we can see the buffer contents. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_getChristoph Hellwig
xfs_attr_rmtval_get is always called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_readlink_bmapChristoph Hellwig
xfs_readlink_bmap is called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_setChristoph Hellwig
xfs_attr_rmtval_set is always called with i_lock held, and i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_buf_associate_memoryChristoph Hellwig
xfs_buf_associate_memory is used for setting up the spare buffer for the log wrap case in xlog_sync which can happen under i_lock when called from xfs_fsync. The i_lock mutex is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. There are a couple more uses of xfs_buf_associate_memory in the log recovery code that are also affected by this, but I'd rather keep the code simple than passing on a gfp_mask argument. Longer term we should just stop requiring the memoery allocation in xlog_sync by some smaller rework of the buffer layer. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_dir_cilookup_resultChristoph Hellwig
xfs_dir_cilookup_result is always called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_da_buf_makeChristoph Hellwig
i_lock is taken in the reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_da_state_allocChristoph Hellwig
xfs_da_state_alloc is always called with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: switch to NOFS allocation under i_lock in xfs_getbmapChristoph Hellwig
xfs_getbmap allocates memory with i_lock held, but i_lock is taken in reclaim context so all allocations under it must avoid recursions into the filesystem. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-12xfs: avoid memory allocation under m_peraglock in growfs codeChristoph Hellwig
Allocate the memory for the larger m_perag array before taking the per-AG lock as the per-AG lock can be taken under the i_lock which can be taken from reclaim context. Reported by the new reclaim context tracing in lockdep. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-07xfs: fix freeing of inodes not yet added to the inode cacheChristoph Hellwig
When freeing an inode that lost race getting added to the inode cache we must not call into ->destroy_inode, because that would delete the inode that won the race from the inode cache radix tree. This patch uses splits a new xfs_inode_free helper out of xfs_ireclaim and uses that plus __destroy_inode to make sure we really only free the memory allocted for the inode that lost the race, and not mess with the inode cache state. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reported-by: Alex Samad <alex@samad.com.au> Reported-by: Andrew Randrianasulu <randrik@mail.ru> Reported-by: Stephane <sharnois@max-t.com> Reported-by: Tommy <tommy@news-service.com> Reported-by: Miah Gregory <mace@darksilence.net> Reported-by: Gabriel Barazer <gabriel@oxeva.fr> Reported-by: Leandro Lucarella <llucax@gmail.com> Reported-by: Daniel Burr <dburr@fami.com.au> Reported-by: Nickolay <newmail@spaces.ru> Reported-by: Michael Guntsche <mike@it-loops.com> Reported-by: Dan Carley <dan.carley+linuxkern-bugs@gmail.com> Reported-by: Michael Ole Olsen <gnu@gmx.net> Reported-by: Michael Weissenbacher <mw@dermichi.com> Reported-by: Martin Spott <Martin.Spott@mgras.net> Reported-by: Christian Kujau <lists@nerdbynature.de> Tested-by: Michael Guntsche <mike@it-loops.com> Tested-by: Dan Carley <dan.carley+linuxkern-bugs@gmail.com> Tested-by: Christian Kujau <lists@nerdbynature.de>
2009-08-07vfs: fix inode_init_always calling conventionChristoph Hellwig
Currently inode_init_always calls into ->destroy_inode if the additional initialization fails. That's not only counter-intuitive because inode_init_always did not allocate the inode structure, but in case of XFS it's actively harmful as ->destroy_inode might delete the inode from a radix-tree that has never been added. This in turn might end up deleting the inode for the same inum that has been instanciated by another process and cause lots of cause subtile problems. Also in the case of re-initializing a reclaimable inode in XFS it would free an inode we still want to keep alive. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-07-31Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: bump up nr_to_write in xfs_vm_writepage xfs: reduce bmv_count in xfs_vn_fiemap
2009-07-31xfs: bump up nr_to_write in xfs_vm_writepageEric Sandeen
VM calculation for nr_to_write seems off. Bump it way up, this gets simple streaming writes zippy again. To be reviewed again after Jens' writeback changes. Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Cc: Chris Mason <chris.mason@oracle.com> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-07-31xfs: reduce bmv_count in xfs_vn_fiemapEric Sandeen
commit 6321e3ed2acf3ee9643cdd403e1c88605d7944ba caused the full bmv_count's worth of getbmapx structures to get allocated; telling it to do MAXEXTNUM was a bit insane, resulting in ENOMEM every time. Chop it down to something reasonable, the number of slots in the caller's input buffer. If this is too large the caller may get ENOMEM but the reason should not be a mystery, and they can try again with something smaller. We add 1 to the value because in the normal getbmap world, bmv_count includes the header and xfs_getbmap does: nex = bmv->bmv_count - 1; if (nex <= 0) return XFS_ERROR(EINVAL); Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Olaf Weber <olaf@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-07-12headers: smp_lock.h reduxAlexey Dobriyan
* Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>