Age | Commit message (Collapse) | Author |
|
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: Fix error return for fallocate() on XFS
xfs: cleanup dmapi macros in the umount path
xfs: remove incorrect sparse annotation for xfs_iget_cache_miss
xfs: kill the STATIC_INLINE macro
xfs: uninline xfs_get_extsz_hint
xfs: rename xfs_attr_fetch to xfs_attr_get_int
xfs: simplify xfs_buf_get / xfs_buf_read interfaces
xfs: remove IO_ISAIO
xfs: Wrapped journal record corruption on read at recovery
xfs: cleanup data end I/O handlers
xfs: use WRITE_SYNC_PLUG for synchronous writeout
xfs: reset the i_iolock lock class in the reclaim path
xfs: I/O completion handlers must use NOFS allocations
xfs: fix mmap_sem/iolock inversion in xfs_free_eofblocks
xfs: simplify inode teardown
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (27 commits)
Driver core: fix race in dev_driver_string
Driver Core: Early platform driver buffer
sysfs: sysfs_setattr remove unnecessary permission check.
sysfs: Factor out sysfs_rename from sysfs_rename_dir and sysfs_move_dir
sysfs: Propagate renames to the vfs on demand
sysfs: Gut sysfs_addrm_start and sysfs_addrm_finish
sysfs: In sysfs_chmod_file lazily propagate the mode change.
sysfs: Implement sysfs_getattr & sysfs_permission
sysfs: Nicely indent sysfs_symlink_inode_operations
sysfs: Update s_iattr on link and unlink.
sysfs: Fix locking and factor out sysfs_sd_setattr
sysfs: Simplify iattr time assignments
sysfs: Simplify sysfs_chmod_file semantics
sysfs: Use dentry_ops instead of directly playing with the dcache
sysfs: Rename sysfs_d_iput to sysfs_dentry_iput
sysfs: Update sysfs_setxattr so it updates secdata under the sysfs_mutex
debugfs: fix create mutex racy fops and private data
Driver core: Don't remove kobjects in device_shutdown.
firmware_class: make request_firmware_nowait more useful
Driver-Core: devtmpfs - set root directory mode to 0755
...
|
|
Noticed that through glibc fallocate would return 28 rather than -1
and errno = 28 for ENOSPC. The xfs routines uses XFS_ERROR format
positive return error codes while the syscalls use negative return
codes. Fixup the two cases in xfs_vn_fallocate syscall to convert to
negative.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Stop the flag saving as we never mangle those in the unmount path, and
hide all the weird arguents to the dmapi code inside the
XFS_SEND_PREUNMOUNT / XFS_SEND_UNMOUNT macros.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
xfs_iget_cache_miss does not get called with the pag_ici_lock held, so
the __releases annotation is incorrect.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Remove our own STATIC_INLINE macro. For small function inside
implementation files just use STATIC and let gcc inline it, and for
those in headers do the normal static inline - they are all small
enough to be inlined for debug builds, too.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
This function is too large to efficiently be inlined.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Using a totally different name for the low-level get operation does
not fit the _int convention used in the rest of the attr code, so
rename it.
While we're at it also fix the prototype to use the normal convention
and mark it static as it's never used outside of xfs_attr.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Currently the low-level buffer cache interfaces are highly confusing
as we have a _flags variant of each that does actually respect the
flags, and one without _flags which has a flags argument that gets
ignored and overriden with a default set. Given that very few places
use the default arguments get rid of the duplication and convert all
callers to pass the flags explicitly. Also remove the now confusing
_flags postfix.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
We set the IO_ISAIO flag for all read/write I/O since early Linux
2.6.x. Remove it as it has lost it's purpose long ago.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Summary of problem:
If a journal record wraps at the physical end of the journal, it has to be
read in two parts in xlog_do_recovery_pass(): a read at the physical end and a
read at the physical beginning. If xlog_bread() has to re-align the first
read, the second read request does not take that re-alignment into account.
If the first read was re-aligned, the second read over-writes the end of the
data from the first read, effectively corrupting it. This can happen either
when reading the record header or reading the record data.
The first sanity check in xlog_recover_process_data() is to check for a valid
clientid, so that is the error reported.
Summary of fix:
If there was a first read at the physical end, XFS_BUF_PTR() returns where the
data was requested to begin. Conversely, because it is the result of
xlog_align(), offset indicates where the requested data for the first read
actually begins - whether or not xlog_bread() has re-aligned it.
Using offset as the base for the calculation of where to place the second read
data ensures that it will be correctly placed immediately following the data
from the first read instead of sometimes over-writing the end of it.
The attached patch has resolved the reported problem of occasional inability
to recover the journal (reporting "bad clientid").
Signed-off-by: Andy Poling <andy@realbig.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Currently we have different end I/O handlers for read vs the different
types of write I/O. But they are all very similar so we could just
use one with a few conditionals and reduce code size a lot.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
The VM and I/O schedulers now expect us to use WRITE_SYNC_PLUG for
synchronous writeout. Right now I can't see any changes in performance
numbers with this, but we're getting some beating for not using it,
and the knowledge definitely could help the block code to make better
decisions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
The iolock is used for protecting reads, writes and block truncates
against each other. We have two classes of callers, the first one is
induced by a file operation and requires a reference to the inode be
held and not dropped after the operation is done:
- xfs_vm_vmap, xfs_vn_fallocate, xfs_read, xfs_write, xfs_splice_read,
xfs_splice_write and xfs_setattr are all implementations of VFS
methods that require a live inode
- xfs_getbmap and xfs_swap_extents are ioctl subcommand for which the
same is true
- xfs_truncate_file is only called on quota inodes just returned from
xfs_iget
- xfs_sync_inode_data does the lock just after an igrab()
- xfs_filestream_associate and xfs_filestream_new_ag take the iolock
on the parent inode of an inode which by VFS rules must be referenced
And we have various calls to truncate blocks past EOF or the whole
file when dropping the last reference to an inode. Unfortunately
lockdep complains when we do memory allocations that can recurse into
the filesystem in the first class because the second class happens to
take the same lock. To avoid this re-init the iolock in the beginning
of xfs_fs_clear_inode to get a new lock class.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
When completing I/O requests we must not allow the memory allocator to
recurse into the filesystem, as we might deadlock on waiting for the
I/O completion otherwise. The only thing currently allocating normal
GFP_KERNEL memory is the allocation of the transaction structure for
the unwritten extent conversion. Add a memflags argument to
_xfs_trans_alloc to allow controlling the allocator behaviour.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Thomas Neumann <tneumann@users.sourceforge.net>
Tested-by: Thomas Neumann <tneumann@users.sourceforge.net>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
When xfs_free_eofblocks is called from ->release the VM might already
hold the mmap_sem, but in the write path we take the iolock before
taking the mmap_sem in the generic write code.
Switch xfs_free_eofblocks to only trylock the iolock if called from
->release and skip trimming the prellocated blocks in that case.
We'll still free them later on the final iput.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Currently the reclaim code for the case where we don't reclaim the
final reclaim is overly complicated. We know that the inode is clean
but instead of just directly reclaiming the clean inode we go through
the whole process of marking the inode reclaimable just to directly
reclaim it from the calling context. Besides being overly complicated
this introduces a race where iget could recycle an inode between
marked reclaimable and actually being reclaimed leading to panics.
This patch gets rid of the existing reclaim path, and replaces it with
a simple call to xfs_ireclaim if the inode was clean. While we're at
it we also use the slightly more lax xfs_inode_clean check we'd use
later to determine if we need to flush the inode here.
Finally get rid of xfs_reclaim function and place the remaining small
bits of reclaim code directly into xfs_fs_destroy_inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Patrick Schreurs <patrick@news-service.com>
Reported-by: Tommy van Leeuwen <tommy@news-service.com>
Tested-by: Patrick Schreurs <patrick@news-service.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: (49 commits)
nilfs2: separate wait function from nilfs_segctor_write
nilfs2: add iterator for segment buffers
nilfs2: hide nilfs_write_info struct in segment buffer code
nilfs2: relocate io status variables to segment buffer
nilfs2: do not return io error for bio allocation failure
nilfs2: use list_splice_tail or list_splice_tail_init
nilfs2: replace mark_inode_dirty as nilfs_mark_inode_dirty
nilfs2: delete mark_inode_dirty in nilfs_delete_entry
nilfs2: delete mark_inode_dirty in nilfs_commit_chunk
nilfs2: change return type of nilfs_commit_chunk
nilfs2: split nilfs_unlink as nilfs_do_unlink and nilfs_unlink
nilfs2: delete redundant mark_inode_dirty
nilfs2: expand inode_inc_link_count and inode_dec_link_count
nilfs2: delete mark_inode_dirty from nilfs_set_link
nilfs2: delete mark_inode_dirty in nilfs_new_inode
nilfs2: add norecovery mount option
nilfs2: add helper to get if volume is in a valid state
nilfs2: move recovery completion into load_nilfs function
nilfs2: apply readahead for recovery on mount
nilfs2: clean up get/put function of a segment usage
...
|
|
inode_change_ok already clears the SGID bit when necessary
so there is no reason for sysfs_setattr to carry code to do
the same, and it is good to kill the extra copy because when
I moved the code last in certain corner cases the code will
look at the wrong gid.
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
These two functions do 90% of the same work and it doesn't significantly
obfuscate the function to allow both the parent dir and the name to change
at the same time. So merge them together to simplify maintenance, and
increase testing.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
By teaching sysfs_revalidate to hide a dentry for
a sysfs_dirent if the sysfs_dirent has been renamed,
and by teaching sysfs_lookup to return the original
dentry if the sysfs dirent has been renamed. I can
show the results of renames correctly without having to
update the dcache during the directory rename.
This massively simplifies the rename logic allowing a lot
of weird sysfs special cases to be removed along with
a lot of now unnecesary helper code.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
With lazy inode updates and dentry operations bringing everything
into sync on demand there is no longer any need to immediately
update the vfs or grab i_mutex to protect those updates as we
make changes to sysfs.
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Now that sysfs_getattr and sysfs_permission refresh the vfs
inode there is no need to immediatly push the mode change
into the vfs cache. Reducing the amount of work needed and
simplifying the locking.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
With the implementation of sysfs_getattr and sysfs_permission
sysfs becomes able to lazily propogate inode attribute changes
from the sysfs_dirents to the vfs inodes. This paves the way
for deleting significant chunks of now unnecessary code.
While doing this we did not reference sysfs_setattr from
sysfs_symlink_inode_operations so I added along with
sysfs_getattr and sysfs_permission.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Lining up the functions in sysfs_symlink_inode_operations
follows the pattern in the rest of sysfs and makes things
slightly more readable.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Currently sysfs updates the timestamps on the vfs directory
inode when we create or remove a directory entry but doesn't
update the cached copy on the sysfs_dirent, fix that oversight.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Cleanly separate the work that is specific to setting the
attributes of a sysfs_dirent from what is needed to update
the attributes of a vfs inode.
Additionally grab the sysfs_mutex to keep any nasties from
surprising us when updating the sysfs_dirent.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
The granularity of sysfs time when we keep it is 1 ns. Which
when passed to timestamp_trunc results in a nop. So remove
the unnecessary function call making sysfs_setattr slightly
easier to read.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Currently every caller of sysfs_chmod_file happens at either
file creation time to set a non-default mode or in response
to a specific user requested space change in policy. Making
timestamps of when the chmod happens and notification of
a file changing mode uninteresting.
Remove the unnecessary time stamp and filesystem change
notification, and removes the last of the explicit inotify
and donitfy support from sysfs.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Calling d_drop unconditionally when a sysfs_dirent is deleted has
the potential to leak mounts, so instead implement dentry delete
and revalidate operations that cause sysfs dentries to be removed
at the appropriate time.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Using dentry instead of d in the function name is what
several other filesystems are doing and it seems to be
a more readable convention.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
The sysfs_mutex is required to ensure updates are and will remain
atomic with respect to other inode iattr updates, that do not happen
through the filesystem.
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Setting fops and private data outside of the mutex at debugfs file
creation introduces a race where the files can be opened with the wrong
file operations and private data. It is easy to trigger with a process
waiting on file creation notification.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: always use GFP_NOFS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (47 commits)
ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)
ext4: Do not override ext2 or ext3 if built they are built as modules
jbd2: Export jbd2_log_start_commit to fix ext4 build
ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT
ext4: Wait for proper transaction commit on fsync
ext4: fix incorrect block reservation on quota transfer.
ext4: quota macros cleanup
ext4: ext4_get_reserved_space() must return bytes instead of blocks
ext4: remove blocks from inode prealloc list on failure
ext4: wait for log to commit when umounting
ext4: Avoid data / filesystem corruption when write fails to copy data
ext4: Use ext4 file system driver for ext2/ext3 file system mounts
ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()
jbd2: Add ENOMEM checking in and for jbd2_journal_write_metadata_buffer()
ext4: remove unused parameter wbc from __ext4_journalled_writepage()
ext4: remove encountered_congestion trace
ext4: move_extent_per_page() cleanup
ext4: initialize moved_len before calling ext4_move_extents()
ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT
ext4: use ext4_data_block_valid() in ext4_free_blocks()
...
|
|
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
exofs: Multi-device mirror support
exofs: Move all operations to an io_engine
exofs: move osd.c to ios.c
exofs: statfs blocks is sectors not FS blocks
exofs: Prints on mount and unmout
exofs: refactor exofs_i_info initialization into common helper
exofs: dbg-print less
exofs: More sane debug print
trivial: some small fixes in exofs documentation
|
|
* git://git.infradead.org/ubifs-2.6:
UBIFS: fix return code in check_leaf
UBI: flush wl before clearing update marker
MAINTAINERS: change e-mail of Artem Bityutskiy
UBIFS: remove manual O_SYNC handling
UBIFS: support mounting of UBI volume character devices
UBI: Add ubi_open_volume_path
|
|
This patch changes on-disk format, it is accompanied with a parallel
patch to mkfs.exofs that enables multi-device capabilities.
After this patch, old exofs will refuse to mount a new formatted FS and
new exofs will refuse an old format. This is done by moving the magic
field offset inside the FSCB. A new FSCB *version* field was added. In
the future, exofs will refuse to mount unmatched FSCB version. To
up-grade or down-grade an exofs one must use mkfs.exofs --upgrade option
before mounting.
Introduced, a new object that contains a *device-table*. This object
contains the default *data-map* and a linear array of devices
information, which identifies the devices used in the filesystem. This
object is only written to offline by mkfs.exofs. This is why it is kept
separate from the FSCB, since the later is written to while mounted.
Same partition number, same object number is used on all devices only
the device varies.
* define the new format, then load the device table on mount time make
sure every thing is supported.
* Change I/O engine to now support Mirror IO, .i.e write same data
to multiple devices, read from a random device to spread the
read-load from multiple clients (TODO: stripe read)
Implementation notes:
A few points introduced in previous patch should be mentioned here:
* Special care was made so absolutlly all operation that have any chance
of failing are done before any osd-request is executed. This is to
minimize the need for a data consistency recovery, to only real IO
errors.
* Each IO state has a kref. It starts at 1, any osd-request executed
will increment the kref, finally when all are executed the first ref
is dropped. At IO-done, each request completion decrements the kref,
the last one to return executes the internal _last_io() routine.
_last_io() will call the registered io_state_done. On sync mode a
caller does not supply a done method, indicating a synchronous
request, the caller is put to sleep and a special io_state_done is
registered that will awaken the caller. Though also in sync mode all
operations are executed in parallel.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
In anticipation for multi-device operations, we separate osd operations
into an abstract I/O API. Currently only one device is used but later
when adding more devices, we will drive all devices in parallel according
to a "data_map" that describes how data is arranged on multiple devices.
The file system level operates, like before, as if there is one object
(inode-number) and an i_size. The io engine will split this to the same
object-number but on multiple device.
At first we introduce Mirror (raid 1) layout. But at the final outcome
we intend to fully implement the pNFS-Objects data-map, including
raid 0,4,5,6 over mirrored devices, over multiple device-groups. And
more. See: http://tools.ietf.org/html/draft-ietf-nfsv4-pnfs-obj-12
* Define an io_state based API for accessing osd storage devices
in an abstract way.
Usage:
First a caller allocates an io state with:
exofs_get_io_state(struct exofs_sb_info *sbi,
struct exofs_io_state** ios);
Then calles one of:
exofs_sbi_create(struct exofs_io_state *ios);
exofs_sbi_remove(struct exofs_io_state *ios);
exofs_sbi_write(struct exofs_io_state *ios);
exofs_sbi_read(struct exofs_io_state *ios);
exofs_oi_truncate(struct exofs_i_info *oi, u64 new_len);
And when done
exofs_put_io_state(struct exofs_io_state *ios);
* Convert all source files to use this new API
* Convert from bio_alloc to bio_kmalloc
* In io engine we make use of the now fixed osd_req_decode_sense
There are no functional changes or on disk additions after this patch.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
If I do a "git mv" together with a massive code change
and commit in one patch, git looses the rename and
records a delete/new instead. This is bad because I want
a rename recorded so later rebased/cherry-picked patches
to the old name will work. Also the --follow is lost.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
Even though exofs has a 4k block size, statfs blocks
is in sectors (512 bytes).
Also if target returns 0 for capacity then make it
ULLONG_MAX. df does not like zero-size filesystems
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
It is important to print in the logs when a filesystem was
mounted and eventually unmounted.
Print the osd-device's osd_name and pid the FS was
mounted/unmounted on.
TODO: How to also print the namespace path the filesystem was
mounted on?
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
There are two places that initialize inodes: exofs_iget() and
exofs_new_inode()
As more members of exofs_i_info that need initialization are
added this code will grow. (soon)
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
Iner-loops printing is converted to EXOFS_DBG2 which is #defined
to nothing.
It is now almost bareable to just leave debug-on. Every operation
is printed once, with most relevant info (I hope).
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
debug prints should be somewhat useful without actually
reading the source code
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
tree-wide: fix misspelling of "definition" in comments
reiserfs: fix misspelling of "journaled"
doc: Fix a typo in slub.txt.
inotify: remove superfluous return code check
hdlc: spelling fix in find_pvc() comment
doc: fix regulator docs cut-and-pasteism
mtd: Fix comment in Kconfig
doc: Fix IRQ chip docs
tree-wide: fix assorted typos all over the place
drivers/ata/libata-sff.c: comment spelling fixes
fix typos/grammos in Documentation/edac.txt
sysctl: add missing comments
fs/debugfs/inode.c: fix comment typos
sgivwfb: Make use of ARRAY_SIZE.
sky2: fix sky2_link_down copy/paste comment error
tree-wide: fix typos "couter" -> "counter"
tree-wide: fix typos "offest" -> "offset"
fix kerneldoc for set_irq_msi()
spidev: fix double "of of" in comment
comment typo fix: sybsystem -> subsystem
...
|
|
Fix the following potential circular locking dependency between
mm->mmap_sem and ei->i_data_sem:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-04115-gec044c5 #37
-------------------------------------------------------
ureadahead/1855 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<ffffffff81107224>] might_fault+0x5c/0xac
but task is already holding lock:
(&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&ei->i_data_sem){++++..}:
[<ffffffff81099bfa>] __lock_acquire+0xb67/0xd0f
[<ffffffff81099e7e>] lock_acquire+0xdc/0x102
[<ffffffff81516633>] down_read+0x51/0x84
[<ffffffff811a2414>] ext4_get_blocks+0x50/0x2a5
[<ffffffff811a3453>] ext4_get_block+0xab/0xef
[<ffffffff81154f39>] do_mpage_readpage+0x198/0x48d
[<ffffffff81155360>] mpage_readpages+0xd0/0x114
[<ffffffff811a104b>] ext4_readpages+0x1d/0x1f
[<ffffffff810f8644>] __do_page_cache_readahead+0x12f/0x1bc
[<ffffffff810f86f2>] ra_submit+0x21/0x25
[<ffffffff810f0cfd>] filemap_fault+0x19f/0x32c
[<ffffffff81107b97>] __do_fault+0x55/0x3a2
[<ffffffff81109db0>] handle_mm_fault+0x327/0x734
[<ffffffff8151aaa9>] do_page_fault+0x292/0x2aa
[<ffffffff81518205>] page_fault+0x25/0x30
[<ffffffff812a34d8>] clear_user+0x38/0x3c
[<ffffffff81167e16>] padzero+0x20/0x31
[<ffffffff81168b47>] load_elf_binary+0x8bc/0x17ed
[<ffffffff81130e95>] search_binary_handler+0xc2/0x259
[<ffffffff81166d64>] load_script+0x1b8/0x1cc
[<ffffffff81130e95>] search_binary_handler+0xc2/0x259
[<ffffffff8113255f>] do_execve+0x1ce/0x2cf
[<ffffffff81027494>] sys_execve+0x43/0x5a
[<ffffffff8102918a>] stub_execve+0x6a/0xc0
-> #0 (&mm->mmap_sem){++++++}:
[<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
[<ffffffff81099e7e>] lock_acquire+0xdc/0x102
[<ffffffff81107251>] might_fault+0x89/0xac
[<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
[<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
[<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
[<ffffffff811be21e>] ext4_fiemap+0x13c/0x159
[<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
[<ffffffff811392ca>] sys_ioctl+0x56/0x79
[<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
1 lock held by ureadahead/1855:
#0: (&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159
stack backtrace:
Pid: 1855, comm: ureadahead Not tainted 2.6.32-04115-gec044c5 #37
Call Trace:
[<ffffffff81098c70>] print_circular_bug+0xa8/0xb7
[<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
[<ffffffff8102f229>] ? sched_clock+0x9/0xd
[<ffffffff81099e7e>] lock_acquire+0xdc/0x102
[<ffffffff81107224>] ? might_fault+0x5c/0xac
[<ffffffff81107251>] might_fault+0x89/0xac
[<ffffffff81107224>] ? might_fault+0x5c/0xac
[<ffffffff81124b44>] ? __kmalloc+0x13b/0x18c
[<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
[<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
[<ffffffff811bca0b>] ? ext4_ext_fiemap_cb+0x0/0x157
[<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
[<ffffffff811be21e>] ext4_fiemap+0x13c/0x159
[<ffffffff81107224>] ? might_fault+0x5c/0xac
[<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
[<ffffffff8129f6d0>] ? __up_read+0x8d/0x95
[<ffffffff81517fb5>] ? retint_swapgs+0x13/0x1b
[<ffffffff811392ca>] sys_ioctl+0x56/0x79
[<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
The CONFIG_EXT4_USE_FOR_EXT23 option must not try to take over the
ext2 or ext3 file systems if the those file system drivers are
configured to be built as mdoules.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This fixes:
ERROR: "jbd2_log_start_commit" [fs/ext4/ext4.ko] undefined!
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|