path: root/fs
Age  Commit message  Author
2009-07-15  jbd: Fix a race between checkpointing code and journal_get_write_access()  Jan Kara
The following race can happen:

  CPU1                                      CPU2
  checkpointing code checks the buffer,
    adds it to an array for writeback
                                            do_get_write_access()
                                            ...
                                            lock_buffer()
                                            unlock_buffer()
  flush_batch() submits the buffer for IO
                                            __jbd_journal_file_buffer()

So a buffer under writeout is returned from do_get_write_access(). Since the filesystem code relies on the fact that journaled buffers cannot be written out, it does not take the buffer lock and so it can modify the buffer while it is under writeout. That can lead to filesystem corruption if we crash at the right moment. A similar problem can happen with the journal_get_create_access() path.

We fix the problem by clearing the buffer dirty bit under buffer_lock even if the buffer is on the BJ_None list. Actually, we clear the dirty bit regardless of the list the buffer is on and warn if the buffer is already journalled.

Thanks for spotting the problem go to dingdinghua <dingdinghua85@gmail.com>.

Reported-by: dingdinghua <dingdinghua85@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-07-15  ext3: Fix truncation of symlinks after failed write  Jan Kara
The contents of long symlinks are written via standard write methods. So when the write fails, we add the inode to the orphan list. But symlinks don't have a .truncate method defined, so nobody properly removes them from the orphan list (both on disk and in memory).

Fix this by calling ext3_truncate() directly instead of calling vmtruncate() (which is saner anyway since we don't need anything vmtruncate() does except for calling .truncate in these paths). We also add the inode to the orphan list only if ext3_can_truncate() is true (currently, it can be false for symlinks when there are no blocks allocated); otherwise orphan list processing will complain and ext3_truncate() will not remove the inode from the on-disk orphan list.

Signed-off-by: Jan Kara <jack@suse.cz>
2009-07-15  jbd: Fail to load a journal if it is too short  Jan Kara
Due to on-disk corruption, the journal can turn out to be too short. Fail to load it in that case so that we don't oops somewhere later.

Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
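A minimal sketch of this kind of sanity check, not the jbd patch itself. The function name `journal_length_ok`, its arguments, and the `MIN_JOURNAL_BLOCKS` threshold are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical minimum journal size in blocks, chosen for illustration. */
#define MIN_JOURNAL_BLOCKS 1024UL

/*
 * Return true if the block range [first, last] recorded in an on-disk
 * journal superblock is large enough to be usable.
 */
bool journal_length_ok(unsigned long first, unsigned long last)
{
	if (last < first)
		return false;	/* corrupted: the range is inverted */
	return last - first + 1 >= MIN_JOURNAL_BLOCKS;
}
```

A loader would refuse the journal as soon as this returns false, rather than carrying on with a range that later code cannot handle.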
2009-07-14  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm  Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
  dlm: free socket in error exit path
  dlm: fix plock use-after-free
  dlm: Fix uninitialised variable warning in lock.c
2009-07-14  Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  Linus Torvalds
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tracing/function-profiler: do not free per cpu variable stat
  tracing/events: Move TRACE_SYSTEM outside of include guard
2009-07-14  9p: Fix incorrect parameters to v9fs_file_readn.  Abhishek Kulkarni
Fix v9fs_vfs_readpage. The offset and size parameters to v9fs_file_readn were interchanged and hence passed incorrectly.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
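To illustrate why this class of bug slips past the compiler, here is a small sketch. `read_n` is a hypothetical reader, not the 9p function; the point is only that two adjacent parameters of the same integer type can be swapped without any diagnostic.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical reader: copy up to `count` bytes starting at `offset`
 * from a backing store into `dst`. Returns the number of bytes copied.
 */
size_t read_n(const char *store, size_t store_len,
	      char *dst, size_t count, size_t offset)
{
	if (offset >= store_len)
		return 0;
	if (count > store_len - offset)
		count = store_len - offset;
	memcpy(dst, store + offset, count);
	return count;
}
```

With the store "abcdefgh", the intended call `read_n(store, 8, buf, 4, 2)` copies "cdef", while the swapped call `read_n(store, 8, buf, 2, 4)` compiles just as cleanly and copies "ef".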
2009-07-14  dlm: free socket in error exit path  Casey Dahlin
In the tcp_connect_to_sock() error exit path, the socket allocated at the top of the function was not being freed.

Signed-off-by: Casey Dahlin <cdahlin@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
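The usual shape of such a fix is a single error label that releases whatever was acquired at the top of the function. The following is a userspace sketch under that assumption; `connect_sketch`, `obj_alloc`, `obj_free`, and the `live_objects` counter are hypothetical and stand in for the socket handling in dlm.

```c
#include <stdlib.h>

/* Counts live allocations so a leak is observable in this sketch. */
static int live_objects;

static void *obj_alloc(void)
{
	void *p = malloc(16);

	if (p)
		live_objects++;
	return p;
}

static void obj_free(void *p)
{
	if (p) {
		free(p);
		live_objects--;
	}
}

/*
 * Hypothetical connect routine. `fail_step` forces a failure after the
 * object has been allocated; the error exit goes through `out_err`,
 * which releases the object.
 */
int connect_sketch(int fail_step, void **out)
{
	void *sock = obj_alloc();
	int ret;

	if (!sock)
		return -1;
	if (fail_step) {
		ret = -2;	/* simulated bind/connect failure */
		goto out_err;
	}
	*out = sock;		/* success: the caller now owns it */
	return 0;

out_err:
	obj_free(sock);
	return ret;
}
```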
2009-07-14  fs/Kconfig: move nilfs2 out  Ryusuke Konishi
The fs/Kconfig file was split into individual fs/*/Kconfig files before nilfs was merged. I've found the current config entry of nilfs is tainting the work. Sorry, I didn't notice. This fixes the violation.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
2009-07-13  Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4  Linus Torvalds
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  jbd2: fix race between write_metadata_buffer and get_write_access
  ext4: Fix ext4_mb_initialize_context() to initialize all fields
  ext4: fix null handler of ioctls in no journal mode
  ext4: Fix buffer head reference leak in no-journal mode
  ext4: Move __ext4_journalled_writepage() to avoid forward declaration
  ext4: Fix mmap/truncate race when blocksize < pagesize && !nodellaoc
  ext4: Fix mmap/truncate race when blocksize < pagesize && delayed allocation
  ext4: Don't look at buffer_heads outside i_size.
  ext4: Fix goal inum check in the inode allocator
  ext4: fix no journal corruption with locale-gen
  ext4: Calculate required journal credits for inserting an extent properly
  ext4: Fix truncation of symlinks after failed write
  jbd2: Fix a race between checkpointing code and journal_get_write_access()
  ext4: Use rcu_barrier() on module unload.
  ext4: naturally align struct ext4_allocation_request
  ext4: mark several more functions in mballoc.c as noinline
  ext4: Fix potential reclaim deadlock when truncating partial block
  jbd2: Remove GFP_ATOMIC kmalloc from inside spinlock critical region
  ext4: Fix type warning on 64-bit platforms in tracing events header
2009-07-13  jbd2: fix race between write_metadata_buffer and get_write_access  dingdinghua
The function jbd2_journal_write_metadata_buffer() calls jbd_unlock_bh_state(bh_in) too early; this could potentially allow another thread to call get_write_access on the buffer head, modify the data, and dirty it, allowing the wrong data to be written into the journal. Fortunately, if we lose this race, the only time this will actually cause filesystem corruption is if there is a system crash or other unclean shutdown of the system before the next commit can take place.

Signed-off-by: dingdinghua <dingdinghua85@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-07-13  Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6  Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  wm97xx_batery: replace driver_data with dev_get_drvdata()
  omap: video: remove direct access of driver_data
  Sound: remove direct access of driver_data
  driver model: fix show/store prototypes in doc.
  Firmware: firmware_class, fix lock imbalance
  Driver Core: remove BUS_ID_SIZE
  sparc: remove driver-core BUS_ID_SIZE
  partitions: fix broken uevent_suppress conversion
  devres: WARN() and return, don't crash on device_del() of uninitialized device
2009-07-13  ext4: Fix ext4_mb_initialize_context() to initialize all fields  Theodore Ts'o
Pavel Roskin pointed out that kmemcheck indicated that ext4_mb_store_history() was accessing uninitialized values of ac->ac_tail and ac->ac_buddy leading to garbage in the mballoc history. Fix this by initializing the entire structure to all zeros first.

Also, two fields were getting doubly initialized by the caller of ext4_mb_initialize_context, so remove them for efficiency's sake.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
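The zero-first pattern can be sketched as follows. `struct alloc_ctx` and `alloc_ctx_init` are hypothetical stand-ins and do not reproduce the ext4 allocation context; only the idea of clearing the whole structure before filling in the known fields is taken from the message above.

```c
#include <string.h>

/* Hypothetical stand-in for an allocation context; not the ext4 struct. */
struct alloc_ctx {
	unsigned int goal;
	unsigned int len;
	unsigned int tail;	/* would otherwise be left uninitialized */
	unsigned int buddy;	/* would otherwise be left uninitialized */
};

void alloc_ctx_init(struct alloc_ctx *ac, unsigned int goal, unsigned int len)
{
	/* Zero everything first so no field carries stale stack contents. */
	memset(ac, 0, sizeof(*ac));
	ac->goal = goal;
	ac->len = len;
}
```

Fields added to the structure later are then covered automatically, at the cost of one memset per initialization.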
2009-07-13  ext4: fix null handler of ioctls in no journal mode  Peng Tao
The EXT4_IOC_GROUP_ADD and EXT4_IOC_GROUP_EXTEND ioctls should not flush the journal in no_journal mode. Otherwise, running resize2fs on a mounted no_journal partition triggers the following error messages:

BUG: unable to handle kernel NULL pointer dereference at 00000014
IP: [<c039d282>] _spin_lock+0x8/0x19
*pde = 00000000
Oops: 0002 [#1] SMP

Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-07-13  ext4: Fix buffer head reference leak in no-journal mode  Curt Wohlgemuth
We found a problem with buffer head reference leaks when using an ext4 partition without a journal. In particular, calls to ext4_forget() would not do a brelse() on the input buffer head, which causes the pages they belong to to not be reclaimable.

Further investigation showed that all places where ext4_journal_forget() and ext4_journal_revoke() are called are subject to the same problem. The patch below changes __ext4_journal_forget/__ext4_journal_revoke to do an explicit release of the buffer head when the journal handle isn't valid.

Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
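A rough model of the fix, assuming the caller hands one reference to the helper. `struct buf`, `journal_forget_stub`, and `forget_sketch` are hypothetical names; the journal branch is modelled as releasing the reference itself, which is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical refcounted buffer, standing in for a buffer head. */
struct buf {
	int refcount;
};

static void buf_put(struct buf *b)
{
	b->refcount--;
}

/* Models the journal layer, which is assumed to release the buffer. */
static void journal_forget_stub(struct buf *b)
{
	buf_put(b);
}

/*
 * Hypothetical "forget" helper. Without a valid journal handle nothing
 * else will drop the caller's reference, so the helper must do it.
 */
int forget_sketch(bool handle_valid, struct buf *b)
{
	if (handle_valid)
		journal_forget_stub(b);
	else
		buf_put(b);	/* the release the leaky version skipped */
	return 0;
}
```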
2009-07-13  tracing/events: Move TRACE_SYSTEM outside of include guard  Li Zefan
If TRACE_INCLUDE_FILE is defined, <trace/events/TRACE_INCLUDE_FILE.h> will be included and compiled, otherwise it will be <trace/events/TRACE_SYSTEM.h>.

So TRACE_SYSTEM should be defined outside of the #if protection, just like TRACE_INCLUDE_FILE.

Imagine this scenario:

  #include <trace/events/foo.h>
    -> TRACE_SYSTEM == foo
  ...
  #include <trace/events/bar.h>
    -> TRACE_SYSTEM == bar
  ...
  #define CREATE_TRACE_POINTS
  #include <trace/events/foo.h>
    -> TRACE_SYSTEM == bar !!!

and then bar.h will be included and compiled.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A5A9CF1.2010007@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-12  genetlink: make netns aware  Johannes Berg
This makes generic netlink network namespace aware. No generic netlink families except for the controller family are made namespace aware; they need to be checked one by one and then have the family->netnsok member set to true.

A new function genlmsg_multicast_netns() is introduced to allow sending a multicast message in a given namespace, for example when it applies to an object that lives in that namespace, and a new function genlmsg_multicast_allns() to send a message to all network namespaces (for objects that do not have an associated netns).

The function genlmsg_multicast() is changed to multicast the message in just init_net, which is currently correct for all generic netlink families since they only work in init_net right now. Some will later want to work in all net namespaces because they do not care about the netns at all -- those will have to be converted to use one of the new functions genlmsg_multicast_allns() or genlmsg_multicast_netns() whenever they are made netns aware in some way.

After this patch families can easily decide whether or not they should be available in all net namespaces. Many genl families use it for objects not related to networking and should therefore be available in all namespaces, but that will have to be done on a per family basis.

Note that this doesn't touch on the checkpoint/restart problem where network namespaces could be used: genl families and multicast groups are numbered globally and I see no easy way of changing that, especially since it must be possible to multicast to all network namespaces for those families that do not care about netns.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12  partitions: fix broken uevent_suppress conversion  Heiko Carstens
git commit f67f129e "Driver core: implement uevent suppress in kobject" contains this chunk for fs/partitions/check.c:

 	/* suppress uevent if the disk supresses it */
-	if (!ddev->uevent_suppress)
+	if (!dev_get_uevent_suppress(pdev))
 		kobject_uevent(&pdev->kobj, KOBJ_ADD);

However that should have been

-	if (!ddev->uevent_suppress)
+	if (!dev_get_uevent_suppress(ddev))

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Ming Lei <tom.leiming@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-07-12  AFS: Fix compilation warning  Artem Bityutskiy
Fix the following warning:

fs/afs/dir.c: In function 'afs_d_revalidate':
fs/afs/dir.c:567: warning: 'fid.vnode' may be used uninitialized in this function
fs/afs/dir.c:567: warning: 'fid.unique' may be used uninitialized in this function

by marking the 'fid' variable as an uninitialized_var. The problem is that gcc doesn't always manage to work out that fid is always set on the path through the function that uses it.

Cc: linux-afs@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12  headers: smp_lock.h redux  Alexey Dobriyan
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-11  Revert "fuse: Fix build error" as unnecessary  Linus Torvalds
This reverts commit 097041e576ee3a50d92dd643ee8ca65bf6a62e21.

Trond had a better fix, which is the parent of this one ("Fix compile error due to congestion_wait() changes")

Requested-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10  isofs: fix Joliet regression  Bartlomiej Zolnierkiewicz
commit 5404ac8e4418ab3d254950ee4f9bcafc1da20b4a ("isofs: cleanup mount option processing") missed the conversion of the joliet option flag, resulting in non-working Joliet support.

CC: walt <w41ter@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10  Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6  Linus Torvalds
* 'linux-next' of git://git.infradead.org/ubifs-2.6:
  UBIFS: fix corruption dump
  UBIFS: clean up free space checking
  UBIFS: small amendments in the LEB scanning code
  UBIFS: dump a little more in case of corruptions
  MAINTAINERS: update ahunter's e-mail address
  UBIFS: allow more than one volume to be mounted
  UBIFS: fix assertion warning
  UBIFS: minor spelling and grammar fixes
  UBIFS: fix 64-bit divisions in debug print
  UBIFS: few spelling fixes
  UBIFS: set write-buffer timout to 3-5 seconds
  UBIFS: slightly optimize write-buffer timer usage
  UBIFS: improve debugging messaged
  UBIFS: fix integer overflow warning
2009-07-10  Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd  Linus Torvalds
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  osdblk: Adjust queue limits to lower device's limits
  osdblk: a Linux block device for OSD objects
  MAINTAINERS: Add osd maintained files (F:)
  exofs: Avoid using file_fsync()
  exofs: Remove IBM copyrights
  exofs: Fix bio leak in error handling path (sync read)
2009-07-10  fuse: Fix build error  Larry Finger
When building v2.6.31-rc2-344-g69ca06c, the following build errors are found due to missing includes:

  CC [M]  fs/fuse/dev.o
fs/fuse/dev.c: In function ‘request_end’:
fs/fuse/dev.c:289: error: ‘BLK_RW_SYNC’ undeclared (first use in this function)
...
fs/nfs/write.c: In function ‘nfs_set_page_writeback’:
fs/nfs/write.c:207: error: ‘BLK_RW_ASYNC’ undeclared (first use in this function)

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10  Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block  Linus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  cfq-iosched: reset oom_cfqq in cfq_set_request()
  block: fix sg SG_DXFER_TO_FROM_DEV regression
  block: call blk_scsi_ioctl_init()
  Fix congestion_wait() sync/async vs read/write confusion
2009-07-10  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2  Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix disorder in cp count on error during deleting checkpoints
  nilfs2: fix lockdep warning between regular file and inode file
  nilfs2: fix incorrect KERN_CRIT messages in case of write failures
  nilfs2: fix hang problem of log writer which occurs after write failures
  nilfs2: remove unlikely directive causing mis-conversion of error code
2009-07-10  block: fix sg SG_DXFER_TO_FROM_DEV regression  FUJITA Tomonori
I overlooked SG_DXFER_TO_FROM_DEV support when I converted sg to use the block layer mapping API (2.6.28).

Douglas Gilbert explained SG_DXFER_TO_FROM_DEV:

http://www.spinics.net/lists/linux-scsi/msg37135.html

=
The semantics of SG_DXFER_TO_FROM_DEV were:
   - copy user space buffer to kernel (LLD) buffer
   - do SCSI command which is assumed to be of the DATA_IN (data from device) variety. This would overwrite some or all of the kernel buffer
   - copy kernel (LLD) buffer back to the user space.

The idea was to detect short reads by filling the original user space buffer with some marker bytes ("0xec" it would seem in this report). The "resid" value is a better way of detecting short reads but that was only added this century and requires co-operation from the LLD.
=

This patch changes the block layer mapping API to support these semantics. It simply adds another field to struct rq_map_data and enables __bio_copy_iov() to copy data from user space even with READ requests.

It would be better to add a flags field and kill the null_mapped and the new from_user fields in struct rq_map_data, but that approach makes it difficult to send this patch to stable trees because the st and osst drivers use struct rq_map_data (they were converted to use the block layer in 2.6.29 and 2.6.30). Well, I should clean up the block layer mapping API.

zhou sf reported this regression and tested this patch:

http://www.spinics.net/lists/linux-scsi/msg37128.html
http://www.spinics.net/lists/linux-scsi/msg37168.html

Reported-by: zhou sf <sxzzsf@gmail.com>
Tested-by: zhou sf <sxzzsf@gmail.com>
Cc: stable@kernel.org
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
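The three-step semantics quoted above can be modelled in userspace as follows. This is only a sketch: `fake_device_read`, `to_from_dev_sketch`, and `count_marker_tail` are hypothetical names, the 64-byte bounce buffer is arbitrary, and the device is simulated by overwriting a prefix of the buffer with 0x11.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical device read: fills only the first `avail` bytes of the
 * kernel buffer, modelling a short DATA_IN transfer.
 */
static void fake_device_read(unsigned char *kbuf, size_t len, size_t avail)
{
	size_t n = avail < len ? avail : len;

	memset(kbuf, 0x11, n);
}

/*
 * To-from-device transfer: user -> kernel, run the command, kernel -> user.
 * Bytes the device did not overwrite keep the caller's marker value.
 */
void to_from_dev_sketch(unsigned char *ubuf, size_t len, size_t avail)
{
	unsigned char kbuf[64];

	if (len > sizeof(kbuf))
		len = sizeof(kbuf);
	memcpy(kbuf, ubuf, len);		/* 1. copy user buffer in */
	fake_device_read(kbuf, len, avail);	/* 2. device overwrites a prefix */
	memcpy(ubuf, kbuf, len);		/* 3. copy everything back */
}

/* Count trailing marker bytes to estimate how short the read was. */
size_t count_marker_tail(const unsigned char *ubuf, size_t len,
			 unsigned char marker)
{
	size_t n = 0;

	while (n < len && ubuf[len - 1 - n] == marker)
		n++;
	return n;
}
```

If step 1 is skipped, the bytes past the short read come back as whatever the kernel buffer happened to contain, and the marker check in user space no longer works.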
2009-07-10  Fix congestion_wait() sync/async vs read/write confusion  Jens Axboe
Commit 1faa16d22877f4839bd433547d770c676d1d964c accidentally broke the bdi congestion wait queue logic, causing us to wait on congestion for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
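The confusion is easy to reproduce whenever two unrelated enumerations index the same array. The sketch below uses the values given in the message (WRITE == 1, BLK_RW_ASYNC == 0); the names `READ_DIR`, `WRITE_DIR`, `RW_ASYNC`, `RW_SYNC`, `waiters`, and `wait_on_class` are hypothetical.

```c
/* Read/write direction, with the write value stated in the message. */
enum { READ_DIR = 0, WRITE_DIR = 1 };

/* Sync/async class, with the async value stated in the message. */
enum { RW_ASYNC = 0, RW_SYNC = 1 };

/* Hypothetical per-class waiter counts, indexed by sync/async class. */
static int waiters[2];

/* Queue the caller on the wait list for the given sync/async class. */
void wait_on_class(int idx)
{
	waiters[idx]++;
}
```

A caller that means "wait for async writeback" but passes `WRITE_DIR` ends up on the `RW_SYNC` list, since both are plain integers with the value 1.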
2009-07-10  [CIFS] Distinguish posix opens and mkdirs from legacy mkdirs in stats  Steve French
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6  Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: when ATTR_READONLY is set, only clear write bits on non-directories
  cifs: remove cifsInodeInfo->inUse counter
  cifs: convert cifs_get_inode_info and non-posix readdir to use cifs_iget
  [CIFS] update cifs version number
  cifs: add and use CIFSSMBUnixSetFileInfo for setattr calls
  cifs: make a separate function for filling out FILE_UNIX_BASIC_INFO
  cifs: rename CIFSSMBUnixSetInfo to CIFSSMBUnixSetPathInfo
  cifs: add pid of initiating process to spnego upcall info
  cifs: fix regression with O_EXCL creates and optimize away lookup
  cifs: add new cifs_iget function and convert unix codepath to use it
2009-07-09  cifs: when ATTR_READONLY is set, only clear write bits on non-directories  Jeff Layton
On Windows servers, ATTR_READONLY apparently either has no meaning or serves as some sort of cue to certain applications for unrelated behavior. This MS kbase article has details:

http://support.microsoft.com/kb/326549/

Don't clear the write bits in the directory mode when ATTR_READONLY is set.

Reported-by: pouchat@peewiki.net
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  cifs: remove cifsInodeInfo->inUse counter  Jeff Layton
It was purported to be a refcounter of some sort, but was never used that way. It never served any purpose that wasn't served equally well by the I_NEW flag.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  cifs: convert cifs_get_inode_info and non-posix readdir to use cifs_iget  Jeff Layton
Rather than allocating an inode and filling it out, have cifs_get_inode_info fill out a cifs_fattr and call cifs_iget. This means a pretty hefty reorganization of cifs_get_inode_info.

For the readdir codepath, add a couple of new functions for filling out cifs_fattr's from different FindFile response infolevels.

Finally, remove cifs_new_inode since there are no more callers.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  [CIFS] update cifs version number  Steve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  cifs: add and use CIFSSMBUnixSetFileInfo for setattr calls  Jeff Layton
When there's an open filehandle, SET_FILE_INFO is apparently preferred over SET_PATH_INFO. Add a new variant that sets a FILE_UNIX_INFO_BASIC infolevel via SET_FILE_INFO and switch cifs_setattr_unix to use the new call when there's an open filehandle available.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  cifs: make a separate function for filling out FILE_UNIX_BASIC_INFO  Jeff Layton
The SET_FILE_INFO variant will need to do the same thing here. Break this code out into a separate function that both variants can call.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  cifs: rename CIFSSMBUnixSetInfo to CIFSSMBUnixSetPathInfo  Jeff Layton
...in preparation for adding a SET_FILE_INFO variant.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-09  cifs: add pid of initiating process to spnego upcall info  Jeff Layton
This will allow the upcall to poke in /proc/<pid>/environ and get the value of the $KRB5CCNAME env var for the process.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
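On the receiving side, the contents of /proc/<pid>/environ are a run of NUL-terminated "NAME=value" strings. A possible lookup over such a buffer is sketched below; `environ_lookup` is a hypothetical helper, not part of any upcall program, and reading the file itself is left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `name` in a buffer laid out like /proc/<pid>/environ:
 * consecutive "NAME=value" strings, each terminated by a NUL byte.
 * Returns a pointer to the value inside `env`, or NULL if absent.
 */
const char *environ_lookup(const char *env, size_t len, const char *name)
{
	size_t nlen = strlen(name);
	size_t pos = 0;

	while (pos < len) {
		const char *entry = env + pos;
		const char *nul = memchr(entry, '\0', len - pos);
		size_t elen;

		if (!nul)
			break;	/* unterminated trailing entry: ignore it */
		elen = (size_t)(nul - entry);
		if (elen > nlen && entry[nlen] == '=' &&
		    memcmp(entry, name, nlen) == 0)
			return entry + nlen + 1;
		pos += elen + 1;
	}
	return NULL;
}
```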
2009-07-09  UBIFS: fix corruption dump  Artem Bityutskiy
In the 'ubifs_recover_leb()' function, when we find corrupted empty space, we dump 8K starting from the offset where the last node ends. This is OK if the corrupted empty space is somewhere near that offset. But if the corruption is far away, at the end of the LEB, we will dump nothing but 0xFF bytes and completely ignore the interesting data. This is observed on a PPC ("kilauea") with NOR flash.

This patch changes the behavior and teaches UBIFS to print only interesting data. I.e., now we find where the corruption starts and start dumping from that offset.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reviewed-by: Adrian Hunter <Adrian.Hunter@nokia.com>
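Locating the start of the corruption amounts to scanning the supposedly empty region for the first byte that is not 0xFF. A sketch, with the hypothetical helper name `first_corrupt_offset`:

```c
#include <stddef.h>

/*
 * Return the offset of the first byte that is not 0xFF in buf[start..len),
 * or -1 if the whole range is empty (erased) space.
 */
long first_corrupt_offset(const unsigned char *buf, size_t start, size_t len)
{
	for (size_t i = start; i < len; i++)
		if (buf[i] != 0xFF)
			return (long)i;
	return -1;
}
```

The dump would then begin at the returned offset instead of at the end of the last valid node.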
2009-07-09  UBIFS: clean up free space checking  Artem Bityutskiy
recovery.c has an 'is_empty()' helper and it is better to use this helper instead of re-implementing it in several places. This patch does this and removes some amount of unneeded code.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reviewed-by: Adrian Hunter <Adrian.Hunter@nokia.com>
2009-07-09  UBIFS: small amendments in the LEB scanning code  Artem Bityutskiy
This patch fixes a few minor things I've spotted while going through the code:

1. Better document return codes.
2. If 'ubifs_scan_a_node()' returns something we do not expect, treat this as an error.
3. Try to do recovery only when 'ubifs_scan()' returns %-EUCLEAN, not on any error.
4. If empty space starts at a non-aligned address, print a message.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reviewed-by: Adrian Hunter <Adrian.Hunter@nokia.com>
2009-07-09  UBIFS: dump a little more in case of corruptions  Artem Bityutskiy
In case of corruptions, dump 8192 bytes instead of 4096. The largest node is 4096+ bytes, so it is better to see a node boundary, which is not always possible when only 4096 bytes are printed.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reviewed-by: Adrian Hunter <Adrian.Hunter@nokia.com>
2009-07-08  cifs: fix regression with O_EXCL creates and optimize away lookup  Jeff Layton
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Shirish Pargaonkar <shirishp@gmail.com>
CC: Stable Kernel <stable@kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-07-08  Remove multiple KERN_ prefixes from printk formats  Joe Perches
Commit 5fd29d6ccbc98884569d6f3105aeca70858b3e0f ("printk: clean up handling of log-levels and newlines") changed printk semantics. printk lines with multiple KERN_<level> prefixes are no longer emitted as before the patch.

<level> is now included in the output on each additional use.

Remove all uses of multiple KERN_<level>s in formats.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
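The visible effect can be modelled with a parser that consumes only one leading level marker. This is a sketch, not the printk implementation: `skip_one_level` is hypothetical, and `LVL_ERR` and `LVL_INFO` are stand-ins that assume the level prefixes are short "<N>" strings.

```c
/* Hypothetical stand-ins for KERN_<level> prefix strings. */
#define LVL_ERR  "<3>"
#define LVL_INFO "<6>"

/* Skip one leading "<N>" log-level prefix, if present, and return the rest. */
const char *skip_one_level(const char *fmt)
{
	if (fmt[0] == '<' && fmt[1] >= '0' && fmt[1] <= '7' && fmt[2] == '>')
		return fmt + 3;
	return fmt;
}
```

Under this model a format written as `LVL_ERR LVL_INFO "disk full"` leaves "<6>disk full" as message text, which is why every format should carry at most one prefix.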
2009-07-08  Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6  Linus Torvalds
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6:
  quota: Fix possible deadlock during parallel quotaon and quotaoff
2009-07-08  Free the memory allocated by memdup_user() in fs/sysfs/bin.c  Catalin Marinas
Commit 1c8542c7bb replaced kmalloc() with memdup_user() in the write() function but also dropped the kfree(temp). The memdup_user() function allocates memory which is never freed.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Parag Warudkar <parag.warudkar@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
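The ownership rule is that a duplicating helper returns memory the caller must release. Below is a userspace sketch of that rule; `memdup_sketch` and `write_sketch` are hypothetical and use malloc/free where the kernel code uses memdup_user()/kfree().

```c
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for a duplicating helper; returns NULL on failure. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len ? len : 1);

	if (p)
		memcpy(p, src, len);
	return p;
}

/*
 * Hypothetical write handler: duplicates the caller's data, consumes it,
 * and frees the copy before returning.
 */
long write_sketch(const void *data, size_t len, unsigned long *sum_out)
{
	unsigned char *temp = memdup_sketch(data, len);
	unsigned long sum = 0;

	if (!temp)
		return -1;
	for (size_t i = 0; i < len; i++)
		sum += temp[i];
	*sum_out = sum;
	free(temp);	/* the release that went missing in the regression */
	return (long)len;
}
```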
2009-07-08  headers: mnt_namespace.h redux  Alexey Dobriyan
Fix various silly problems wrt mnt_namespace.h:

 - exit_mnt_ns() isn't used, remove it
 - that done, sched.h and nsproxy.h inclusions aren't needed
 - mount.h inclusion was needed for vfsmount_lock, but no longer
 - remove mnt_namespace.h inclusion from files which don't use anything from mnt_namespace.h

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-07  quota: Fix possible deadlock during parallel quotaon and quotaoff  Jiaying Zhang
The following test script triggers a deadlock on an ext2 filesystem:

while true; do quotaon /dev/hda >&/dev/null; usleep $RANDOM; done &
while true; do quotaoff /dev/hda >&/dev/null; usleep $RANDOM; done &

I found there is a potential deadlock between quotaon and quotaoff (or quotasync). Basically, all quotactl operations need to be protected by dqonoff_mutex. vfs_quota_off and vfs_quota_sync also call sb->s_op->quota_write, which needs to grab the i_mutex of the quota file. But in vfs_quota_on_inode (called from the quotaon operation), the current code tries to grab the i_mutex of the quota file before getting dqonoff_mutex.

Reverse the order in which we take locks in vfs_quota_on_inode().

Jan Kara: Changed changelog to be more readable, made lockdep happy with I_MUTEX_QUOTA.

Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
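The general rule behind the fix is that every path takes the two locks in the same order. A userspace sketch with POSIX mutexes follows; `onoff_mutex`, `file_mutex`, `quota_on_sketch`, and `quota_off_sketch` are hypothetical names that only mirror the roles of dqonoff_mutex and the quota file's i_mutex.

```c
#include <pthread.h>

static pthread_mutex_t onoff_mutex = PTHREAD_MUTEX_INITIALIZER;	/* outer lock */
static pthread_mutex_t file_mutex = PTHREAD_MUTEX_INITIALIZER;	/* inner lock */
static int quota_enabled;

/* Enable path: outer lock first, then inner lock. */
void quota_on_sketch(void)
{
	pthread_mutex_lock(&onoff_mutex);
	pthread_mutex_lock(&file_mutex);
	quota_enabled = 1;
	pthread_mutex_unlock(&file_mutex);
	pthread_mutex_unlock(&onoff_mutex);
}

/* Disable path: the same order, so the two paths cannot deadlock. */
void quota_off_sketch(void)
{
	pthread_mutex_lock(&onoff_mutex);
	pthread_mutex_lock(&file_mutex);
	quota_enabled = 0;
	pthread_mutex_unlock(&file_mutex);
	pthread_mutex_unlock(&onoff_mutex);
}
```

If one path took `file_mutex` first, two threads could each hold one lock while waiting for the other, which is the situation the test script provokes.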
2009-07-06  cred_guard_mutex: do not return -EINTR to user-space  Oleg Nesterov
do_execve() and ptrace_attach() return -EINTR if mutex_lock_interruptible(->cred_guard_mutex) fails.

This is not right; change the code to return -ERESTARTNOINTR.

Perhaps we should also change proc_pid_attr_write().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-06  sys_sync(): fix 16% performance regression in ffsb create_4k test  Zhang, Yanmin
I run many ffsb test cases on JBODs (typically 13/12 disks). Compared with kernel 2.6.30, 2.6.31-rc1 has about a 16% regression with ffsb_create_4k. The sub test case creates files continuously for 10 minutes and every file is 1MB.

Bisect located the patch below.

5cee5815d1564bbbd505fea86f4550f1efdb5cd0 is first bad commit
commit 5cee5815d1564bbbd505fea86f4550f1efdb5cd0
Author: Jan Kara <jack@suse.cz>
Date:   Mon Apr 27 16:43:51 2009 +0200

    vfs: Make sys_sync() use fsync_super() (version 4)

    It is unnecessarily fragile to have two places (fsync_super() and do_sync()) doing data integrity sync of the filesystem. Alter __fsync_super() to accommodate needs of both callers and use it. So after this patch __fsync_super() is the only place where we gather all the calls needed to properly send all data on a filesystem to disk.

As a matter of fact, ffsb calls sys_sync at the end to make sure all data is flushed to disk, and the flushing is counted in the result. vmstat shows ffsb is blocked for a long time when syncing. With 2.6.30, ffsb is blocked for a short time.

I checked the patch and did experiments to restore the original methods. Eventually, the root cause is that the patch removes the call to wakeup_pdflush when syncing, so only ffsb is blocked on disk I/O. wakeup_pdflush could ask pdflush to write back pages at the same time as ffsb.

[akpm@linux-foundation.org: restore comment too]
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>