Age | Commit message (Collapse) | Author |
|
|
|
This patch rips out the XFS ACL handling code and uses the generic
fs/posix_acl.c code instead. The ondisk format is of course left
unchanged.
This also introduces the same ACL caching all other Linux filesystems do
by adding pointers to the acl and default acl in struct xfs_inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
In commit code, we scan buffers attached to a transaction. During this
scan, we sometimes have to drop j_list_lock and then we recheck whether
the journal buffer head didn't get freed by journal_try_to_free_buffers().
But checking for buffer_jbd(bh) isn't enough because a new journal head
could get attached to our buffer head. So add a check whether the journal
head remained the same and whether it's still at the same transaction and
list.
This is a nasty bug and can cause problems like memory corruption (use after
free) or trigger various assertions in JBD code (observed).
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The recent ->lookup() deadlock correction required the directory inode
mutex to be dropped while waiting for expire completion. We were
concerned about side effects from this change and one has been identified.
I saw several error messages.
They cause autofs to become quite confused and don't really point to the
actual problem.
Things like:
handle_packet_missing_direct:1376: can't find map entry for (43,1827932)
which is usually totally fatal (although in this case it wouldn't be
except that I treat is as such because it normally is).
do_mount_direct: direct trigger not valid or already mounted
/test/nested/g3c/s1/ss1
which is recoverable, however if this problem is at play it can cause
autofs to become quite confused as to the dependencies in the mount tree
because mount triggers end up mounted multiple times. It's hard to
accurately check for this over mounting case and automount shouldn't need
to if the kernel module is doing its job.
There was one other message, similar in consequence of this last one but I
can't locate a log example just now.
When checking if a mount has already completed prior to adding a new mount
request to the wait queue we check if the dentry is hashed and, if so, if
it is a mount point. But, if a mount successfully completed while we
slept on the wait queue mutex the dentry must exist for the mount to have
completed so the test is not really needed.
Mounts can also be done on top of a global root dentry, so for the above
case, where a mount request completes and the wait queue entry has already
been removed, the hashed test returning false can cause an incorrect
callback to the daemon. Also, d_mountpoint() is not sufficient to check
if a mount has completed for the multi-mount case when we don't have a
real mount at the base of the tree.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
SYNC_BDFLUSH is a leftover from IRIX and rather misnamed for todays
code. Make xfs_sync_fsdata and xfs_dq_sync use the SYNC_TRYLOCK flag
for not blocking on logs just as the inode sync code already does.
For xfs_sync_fsdata it's a trivial 1:1 replacement, but for xfs_qm_sync
I use the opportunity to decouple the non-blocking lock case from the
different flushing modes, similar to the inode sync code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
We want to wait for all I/O to finish when we do data integrity syncs. So
there is no reason to keep SYNC_WAIT separate from SYNC_IOWAIT. This
causes a little change in behaviour for the ENOSPC flushing code which now
does a second submission and wait of buffered I/O, but that should finish
ASAP as we already did an asynchronous writeout earlier.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
xfs_sync_inodes is used to write back either file data or inode metadata.
In general we always do these separately, except for one fishy case in
xfs_fs_put_super that does both. So separate xfs_sync_inodes into
separate xfs_sync_data and xfs_sync_attr functions. In xfs_fs_put_super
we first call the data sync and then the attr sync as that was the previous
order. The moved log force in that path doesn't make a difference because
we will force the log again as part of the real unmount process.
The filesystem readonly checks are not performed by the new function but
instead moved into the callers, given that most callers alredy have it
further up in the stack. Also add debug checks that we do not pass in
incorrect flags in the new xfs_sync_data and xfs_sync_attr function and
fix the one place that did pass in a wrong flag.
Also remove a comment mentioning xfs_sync_inodes that has been incorrect
for a while because we always take either the iolock or ilock in the
sync path these days.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Use xfs_inode_ag_iterator instead of opencoding the inode walk in the
quota code. Mark xfs_inode_ag_iterator and xfs_sync_inode_valid non-static
to allow using them from the quota code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Given that we walk across the per-ag inode lists so often, it makes sense to
introduce an iterator for this.
Convert the sync and reclaim code to use this new iterator, quota code will
follow in the next patch.
Also change xfs_reclaim_inode to return -EGAIN instead of 1 for an inode
already under reclaim. This simplifies the AG iterator and doesn't
matter for the only other caller.
[hch: merged the lookup and execute callbacks back into one to get the
pag_ici_lock locking correct and simplify the code flow]
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
The noblock parameter of xfs_reclaim_inodes is only ever set to zero. Remove
it and all the conditional code that is never executed.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Separate the validation of inodes found by the radix
tree walk from the radix tree lookup.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
In many cases we only want to sync inode metadata. Split out the inode
flushing into a separate helper to prepare factoring the inode sync code.
Based on a patch from Dave Chinner, but redone to keep the current behaviour
exactly and leave changes to the flushing logic to another patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
In many cases we only want to sync inode data. Start spliting the inode sync
into data sync and inode sync by factoring out the inode data flush.
[hch: minor cleanups]
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Kill the quota ops function vector and replace it with direct calls or
stubs in the CONFIG_XFS_QUOTA=n case.
Make sure we check XFS_IS_QUOTA_RUNNING in the right spots. We can remove
the number of those checks because the XFS_TRANS_DQ_DIRTY flag can't be set
otherwise.
This brings us back closer to the way this code worked in IRIX and earlier
Linux versions, but we keep a lot of the more useful factoring of common
code.
Eventually we should also kill xfs_qm_bhv.c, but that's left for a later
patch.
Reduces the size of the source code by about 250 lines and the size of
XFS module by about 1.5 kilobytes with quotas enabled:
text data bss dec hex filename
615957 2960 3848 622765 980ad fs/xfs/xfs.o
617231 3152 3848 624231 98667 fs/xfs/xfs.o.old
Fallout:
- xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects
the inode locked and xfs_qm_dqattach which does the locking around it,
thus removing XFS_QMOPT_ILOCKED.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Arkadiusz has seen really strange crashes in xfs_qm_dqcheck that
I can only explain by a log item being too smal to actually fit the
xfs_dqblk_t we're dereferencing all over xfs_qm_dqcheck. So add
graceful checks for NULL or too small quota items to the log recovery
code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Commit a6634fba3dec4a92f0a2c4e30c80b634c0576ad5 in xfsprogs increased the
maximum log size supported by mkfs. Merged back the changes to xfs_fs.h
so the growfs enforced the same limit and the headers are in sync.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
CONFIG_IMA=y inode activity leaks iint_cache and radix_tree_node objects
until the system runs out of memory. Nowhere is calling ima_inode_free()
a.k.a. ima_iint_delete(). Fix that by calling it from destroy_inode().
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
OK, that's probably the easiest way to do that, as much as I don't like it...
Since iget() et.al. will not accept I_FREEING (will wait to go away
and restart), and since we'd better have serialization between new/free
on fs data structures anyway, we can afford simply skipping I_FREEING
et.al. in insert_inode_locked().
We do that from new_inode, so it won't race with free_inode in any interesting
ways and it won't race with iget (of any origin; nfsd or in case of fs
corruption a lookup) since both still will wait for I_LOCK.
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Tested-by: David Watson <dbwatson@ukfsn.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The nobh_truncate_page() function is used by ext2, exofs, and jfs. Of
these three, only ext2 and jfs's get_block() function pays attention
to bh->b_size --- which is normally always the filesystem blocksize
except when the get_block() function is called by either
mpage_readpage(), mpage_readpages(), or the direct I/O routines in
fs/direct_io.c.
Unfortunately, nobh_truncate_page() does not initialize map_bh before
calling the filesystem-supplied get_block() function. So ext2 and jfs
will try to calculate the number of blocks to map by taking stack
garbage and shifting it left by inode->i_blkbits. This should be
*mostly* harmless (except the filesystem will do some unnneeded work)
unless the stack garbage is less than filesystem's blocksize, in which
case maxblocks will be zero, and the attempt to find out whether or
not the filesystem has a hole at a given logical block will fail, and
the page cache entry might not get zero'ed out.
Also if the stack garbage in in map_bh->state happens to have the
BH_Mapped bit set, there could be an attempt to call readpage() on a
non-existent page, which could cause nobh_truncate_page() to return an
error when it should not.
Fix this by initializing map_bh->state and map_bh->size.
Fortunately, it's probably fairly unlikely that ext2 and jfs users
mount with nobh these days.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: Fix oops and use after free during space balancing
Btrfs: set device->total_disk_bytes when adding new device
|
|
The btrfs allocator uses list_for_each to walk the available block
groups when searching for free blocks. It starts off with a hint
to help find the best block group for a given allocation.
The hint is resolved into a block group, but we don't properly check
to make sure the block group we find isn't in the middle of being
freed due to filesystem shrinking or balancing. If it is being
freed, the list pointers in it are bogus and can't be trusted. But,
the code happily goes along and uses them in the list_for_each loop,
leading to all kinds of fun.
The fix used here is to check to make sure the block group we find really
is on the list before we use it. list_del_init is used when removing
it from the list, so we can do a proper check.
The allocation clustering code has a similar bug where it will trust
the block group in the current free space cluster. If our allocation
flags have changed (going from single spindle dup to raid1 for example)
because the drives in the FS have changed, we're not allowed to use
the old block group any more.
The fix used here is to check the current cluster against the
current allocation flags.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
It was not being properly initialized, and so the size saved to
disk was not correct.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: prevent deadlock in xfs_qm_shake()
xfs: fix overflow in xfs_growfs_data_private
xfs: fix double unlock in xfs_swap_extents()
|
|
It's possible to recurse into filesystem from the memory
allocation, which deadlocks in xfs_qm_shake(). Add check
for __GFP_FS, and bail out if it is not set.
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
In the case where growing a filesystem would leave the last AG
too small, the fixup code has an overflow in the calculation
of the new size with one fewer ag, because "nagcount" is a 32
bit number. If the new filesystem has > 2^32 blocks in it
this causes a problem resulting in an EINVAL return from growfs:
# xfs_io -f -c "truncate 19998630180864" fsfile
# mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
# mount -o loop fsfile /mnt
# xfs_growfs /mnt
meta-data=/dev/loop0 isize=256 agcount=52,
agsize=76288719 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=3905982455, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=32768, version=2
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument
Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
Regreesion from commit ef8f7fc, which rearranged the code in
xfs_swap_extents() leading to double unlock of xfs inode ilock.
That resulted in xfs_fsr deadlocking itself on platforms, which
don't handle double unlock of rw_semaphore nicely. It caused the
count go negative, which represents the write holder, without
really having one. ia64 is one of the platforms where deadlock
was easily reproduced and the fix was tested.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
It's possible to recurse into filesystem from the memory
allocation, which deadlocks in xfs_qm_shake(). Add check
for __GFP_FS, and bail out if it is not set.
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix bh leak in nilfs_cpfile_delete_checkpoints function
|
|
The nilfs_cpfile_delete_checkpoints() wrongly skips brelse() for the
header block of checkpoint file in case of errors. This fixes the
leak bug.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
* git://git.infradead.org/~dwmw2/mtd-2.6.30:
jffs2: Fix corruption when flash erase/write failure
mtd: MXC NAND driver fixes (v5)
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
Driver Core: do not oops when driver_unregister() is called for unregistered drivers
sysfs: file.c: use create_singlethread_workqueue()
|
|
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux:
svcrdma: dma unmap the correct length for the RPCRDMA header page.
nfsd: Revert "svcrpc: take advantage of tcp autotuning"
nfsd: fix hung up of nfs client while sync write data to nfs server
|
|
The flat loader uses an architecture's flat_stack_align() to align the
stack but assumes word-alignment is enough for the data sections.
However, on the Xtensa S6000 we have registers up to 128bit width
which can be used from userspace and therefor need userspace stack and
data-section alignment of at least this size.
This patch drops flat_stack_align() and uses the same alignment that
is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's
not defined by the architecture.
It also fixes m32r which was obviously kaput, aligning an
uninitialized stack entry instead of the stack pointer.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oskar Schirmer <os@emlix.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
proc_pident_instantiate() has following call flow.
proc_pident_lookup()
proc_pident_instantiate()
proc_pid_make_inode()
And, proc_pident_lookup() has following error handling.
const struct pid_entry *p, *last;
error = ERR_PTR(-ENOENT);
if (!task)
goto out_no_task;
Then, proc_pident_instantiate should return ENOENT too when racing against
exit(2) occur.
EINAL has two bad reason.
- it implies caller is wrong. bad the race isn't caller's mistake.
- man 2 open don't explain EINVAL. user often don't handle it.
Note: Other proc_pid_make_inode() caller already use ENOENT properly.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Erase errors such as:
"Newly-erased block contained word 0xa4ef223e at offset 0x0296a014"
and failure to write the clean marker,
moves the offending erase block to erasing list before calling
jffs2_erase_failed(). This is bad as jffs2_erase_failed() will
also move the block to the bad_list, but is now moving the
wrong block, causing FS corruption.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
We don't need a kernel thread per CPU for this application.
Acked-by: Alex Chiang <achiang@hp.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Commit 'Short write in nfsd becomes a full write to the client'
(31dec2538e45e9fff2007ea1f4c6bae9f78db724) broken the sync write.
With the following commands to reproduce:
$ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt
$ cd /mnt
$ echo aaaa > temp.txt
Then nfs client is hung up.
In SYNC mode the server alaways return the write count 0 to the
client. This is because the value of host_err in nfsd_vfs_write()
will be overwrite in SYNC mode by 'host_err=nfsd_sync(file);',
and then we return host_err(which is now 0) as write count.
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
|
Fix up renamed filenames in comments in fs/cachefiles/internal.h.
Originally, the files were all called cf-xxx.c, but they got renamed to
just xxx.c.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix up renamed filenames in comments in fs/fscache/internal.h.
Originally, the files were all called fsc-xxx.c, but they got renamed to
just xxx.c.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In the case where growing a filesystem would leave the last AG
too small, the fixup code has an overflow in the calculation
of the new size with one fewer ag, because "nagcount" is a 32
bit number. If the new filesystem has > 2^32 blocks in it
this causes a problem resulting in an EINVAL return from growfs:
# xfs_io -f -c "truncate 19998630180864" fsfile
# mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
# mount -o loop fsfile /mnt
# xfs_growfs /mnt
meta-data=/dev/loop0 isize=256 agcount=52,
agsize=76288719 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=3905982455, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=32768, version=2
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument
Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
|
|
If the asynchronous lease renewal fails (usually due to a soft timeout),
then we _must_ schedule state recovery in order to ensure that we don't
lose the lease unnecessarily or, if the lease is already lost, that we
recover the locking state promptly...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
fix build error with latest kbuild adjustments to initconst.
The commit a447c0932445f92ce6f4c1bd020f62c5097a7842 ("vfs: Use
const for kernel parser table") changed:
static match_table_t __initdata tokens = {
to
static match_table_t __initconst tokens = {
But the missing const causes popwerpc to fail with latest
updates to __initconst like this:
fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict
fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict
The bug is only present with kbuild-next.
Following patch has been build tested.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Avoid open on possible directories since Samba now rejects them
|
|
Small change (mostly formatting) to limit lookup based open calls to
file create only.
After discussion yesteday on samba-technical about the posix lookup
regression, and looking at a problem with cifs posix open to one
particular Samba version, Jeff and JRA realized that Samba server's
behavior changed in this area (posix open behavior on files vs.
directories). To make this behavior consistent, JRA just made a
fix to Samba server to alter how it handles open of directories (now
returning the equivalent of EISDIR instead of success). Since we don't
know at lookup time whether the inode is a directory or file (and
thus whether posix open will succeed with most current Samba server),
this change avoids the posix open code on lookup open (just issues
posix open on creates). This gets the semantic benefits we want
(atomicity, posix byte range locks, improved write semantics on newly
created files) and file create still is fast, and we avoid the problem
that Jeff noticed yesterday with "openat" (and some open directory
calls) of non-cached directories to one version of Samba server, and
will work with future Samba versions (which include the fix jra just
pushed into Samba server). I confirmed this approach with jra
yesterday and with Shirish today.
Posix open is only called (at lookup time) for file create now.
For opens (rather than creates), because we do not know if it
is a file or directory yet, and current Samba no longer allows
us to do posix open on dirs, we could end up wasting an open call
on what turns out to be a dir. For file opens, we wait to call posix
open till cifs_open. It could be added here (lookup) in the future
but the performance tradeoff of the extra network request when EISDIR
or EACCES is returned would have to be weighed against the 50%
reduction in network traffic in the other paths.
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
CC: Jeremy Allison <jra@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix memory leak in nilfs_ioctl_clean_segments
|
|
This fixes a new memory leak problem in garbage collection. The
problem was brought by the bugfix patch ("nilfs2: fix lock order
reversal in nilfs_clean_segments ioctl").
Thanks to Kentaro Suzuki for finding this problem.
Reported-by: Kentaro Suzuki <k_suzuki@ms.sylc.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
Posix open code was not properly adding the file to the
list of open files. Fix allocating cifsFileInfo
more than once, and adding twice to flist and tlist.
Also fix mode setting to be done in one place in these
paths.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix pointer initialization and checks in cifs_follow_symlink (try #4)
|
|
This is the third respin of the patch posted yesterday to fix the error
handling in cifs_follow_symlink. It also includes a fix for a bogus NULL
pointer check in CIFSSMBQueryUnixSymLink that Jeff Moyer spotted.
It's possible for CIFSSMBQueryUnixSymLink to return without setting
target_path to a valid pointer. If that happens then the current value
to which we're initializing this pointer could cause an oops when it's
kfree'd.
This patch is a little more comprehensive than the last patches. It
reorganizes cifs_follow_link a bit for (hopefully) better readability.
It should also eliminate the uneeded allocation of full_path on servers
without unix extensions (assuming they can get to this point anyway, of
which I'm not convinced).
On a side note, I'm not sure I agree with the logic of enabling this
query even when unix extensions are disabled on the client. It seems
like that should disable this as well. But, changing that is outside the
scope of this fix, so I've left it alone for now.
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Christoph Hellwig <hch@inraded.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
The problem is that permission checking is skipped if atomic open is
possible, but when exec opens a file, it just opens it O_READONLY which
means EXEC permission will not be checked at that time.
This problem is observed by the following sequence (executed as root):
mount -t nfs4 server:/ /mnt4
echo "ls" >/mnt4/foo
chmod 744 /mnt4/foo
su guest -c "mnt4/foo"
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Tested-by: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|