aboutsummaryrefslogtreecommitdiff
path: root/drivers/block/ll_rw_blk.c
AgeCommit message (Collapse)Author
2005-11-01[BLOCK] Update read/write block io statistics at completion timeJens Axboe
Right now we do it at queueing time, which works alright for reads (since they are usually sync), but not for async writes since we can queue io a lot faster than we can complete it. This makes the vmstat output look extremely bursty. Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28Merge branch 'elevator-switch' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
Manual fixup for trivial "gfp_t" changes.
2005-10-28Merge branch 'generic-dispatch' of ↵Linus Torvalds
git://brick.kernel.dk/data/git/linux-2.6-block
2005-10-28Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
2005-10-28[PATCH] gfp_t: block layer coreAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28[BLOCK] elevator switch fixes/cleanupJens Axboe
- 100msec sleep is a little excessive, lots of requests can complete in that timeframe. Use 10msec instead. - Rename QUEUE_FLAG_BYPASS to QUEUE_FLAG_ELVSWITCH to indicate what is going on. Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28[BLOCK] Reimplement elevator switchTejun Heo
This patch reimplements elevator switch. This patch assumes generic dispatch queue patchset is applied. * Each request is tagged with REQ_ELVPRIV flag if it has its elevator private data set. * Requests which doesn't have REQ_ELVPRIV flag set never enter iosched. They are always directly back inserted to dispatch queue. Of course, elevator_put_req_fn is called only for requests which have its REQ_ELVPRIV set. * Request queue maintains the current number of requests which have its elevator data set (elevator_set_req_fn called) in q->rq->elvpriv. * If a request queue has QUEUE_FLAG_BYPASS set, elevator private data is not allocated for new requests. To switch to another iosched, we set QUEUE_FLAG_BYPASS and wait until elvpriv goes to zero; then, we attach the new iosched and clears QUEUE_FLAG_BYPASS. New implementation is much simpler and main code paths are less cluttered, IMHO. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28[PATCH] 01/05 Implement generic dispatch queueTejun Heo
Implements generic dispatch queue which can replace all dispatch queues implemented by each iosched. This reduces code duplication, eases enforcing semantics over dispatch queue, and simplifies specific ioscheds. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28Following the same idea, it occurs to me that we should only updateChen, Kenneth W
disk stat when "now" is different from disk->stamp. Otherwise, we are again needlessly adding zero to the stats. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28[patch] remove gendisk->stamp_idle fieldChen, Kenneth W
struct gendisk has these two fields: stamp, stamp_idle. Update to stamp_idle is always in sync with stamp and they are always the same. Therefore, it does not add any value in having two fields tracking same timestamp. Suggest to remove it. Also, we should only update gendisk stats with non-zero value. Advantage is that we don't have to needlessly calculate memory address, and then add zero to the content. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Jens Axboe <axboe@suse.de>
2005-09-21[PATCH] remove blkdev_scsi_issue_flush_fn againChristoph Hellwig
This function was removed a while ago, but crept in again via a recent scsi merge. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-for-linus-2.6 Linus Torvalds
2005-09-07[PATCH] blk: Use blk_queue_xxx functions to set parametersStuart McLaren
Per-queue parameters should be updated using the appropriate blk_queue_xxx functions. Signed-off-by: Stuart McLaren <stuart.mclaren@hp.com> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-28fix mismerge in ll_rw_blk.cJames Bottomley
2005-08-05[PATCH] blk: fix tag shrinking (revive real_max_size)Tejun Heo
My patch in commit fa72b903f75e4f0f0b2c2feed093005167da4023 incorrectly removed blk_queue_tag->real_max_depth. The original resize implementation was incorrect in the following points. * actual allocation size of tag_index was shorter than real_max_size, but assumed to be of the same size, possibly causing memory access beyond the allocated area. * bits in tag_map between max_deptn and real_max_depth were initialized to 1's, making the tags permanently reserved. In an attempt to fix above two bugs, I had removed allocation optimization in init_tag_map and real_max_size. Tag map/index were allocated and freed immediately during resize. Unfortunately, I wasn't considering that tag map/index can be resized dynamically with tags beyond new_depth active. This led to accessing freed area after shrinking tags and led to the following bug reporting thread on linux-scsi. http://marc.theaimsgroup.com/?l=linux-scsi&m=112319898111885&w=2 To fix the problem, I've revived real_max_depth without allocation optimization in init_tag_map, and Andrew Vasquez confirmed that the problem was fixed. As Jens is not going to be available for a week, he asked me to make sure that this patch reaches you. http://marc.theaimsgroup.com/?l=linux-scsi&m=112325778530886&w=2 Also, a comment was added to make sure that real_max_size is needed for dynamic shrinking. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29[PATCH] Fix get_request nastinessHugh Dickins
get_request is now expected to be holding on to queue_lock, with interrupts disabled, when it returns NULL; but one path forgot that, causing all kinds of nastiness under swap load - badness backtraces, strange failures, BUGs. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28[PATCH] blk: light iocontext opsNick Piggin
get_io_context needlessly turned off interrupts and checked for racing io context creations. Both of which aren't needed, because the io context can only be created while in process context of the current process. Also, split the function in 2. A light version, current_io_context does not elevate the reference count specifically, but can be used when in process context, because the process holds a reference itself. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28[PATCH] blk: reduce lockingNick Piggin
Change around locking a bit for a result of 1-2 less spin lock unlock pairs in request submission paths. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28[PATCH] blk: __make_request efficiencyNick Piggin
In the case where the request is not able to be merged by the elevator, don't retake the lock and retry the merge mechanism after allocating a new request. Instead assume that the chance of a merge remains slim, and now that we've done most of the work allocating a request we may as well just go with it. Also be rid of the GFP_ATOMIC allocation: we've got working mempools for the block layer now, so let's save atomic memory for things like networking. Lastly, in get_request_wait, do an initial get_request call before going into the waitqueue. This is reported to help efficiency. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28[PATCH] ll_rw_blk: prevent huge request allocationsJens Axboe
Currently we cap request allocations at q->nr_requests, but we allow a batching io context to allocate up to 32 more (default setting). This can flood the queue with request allocations, with only a few batching processes. The real fix would be to limit the number of batchers, but as that isn't currently tracked, I suggest we just cap the maximum number of allocated requests to eg 50% over the limit. This was observed in real life, users typically see this as vmstat bo numbers going off the wall with seconds of no queueing afterwards. Behaviour this bursty is not beneficial. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27[PATCH] Update cfq io scheduler to time sliced designJens Axboe
This updates the CFQ io scheduler to the new time sliced design (cfq v3). It provides full process fairness, while giving excellent aggregate system throughput even for many competing processes. It supports io priorities, either inherited from the cpu nice value or set directly with the ioprio_get/set syscalls. The latter closely mimic set/getpriority. This import is based on my latest from -mm. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25[PATCH] ll_merge_requests_fn() cleanupNikita Danilov
ll_merge_requests_fn() assigns total_{phys,hw}_segments twice. Fix this and a typo. Signed-off-by: Nikita Danilov <nikita@clusterfs.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25[PATCH] drivers/block/ll_rw_blk.c: cleanupsAdrian Bunk
This patch contains the following cleanups: - make needlessly global code static - remove the following unused global functions: - blkdev_scsi_issue_flush_fn - __blk_attempt_remerge - remove the following unused EXPORT_SYMBOL's: - blk_phys_contig_segment - blk_hw_contig_segment - blkdev_scsi_issue_flush_fn - __blk_attempt_remerge Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] blk: unplug laterNick Piggin
get_request_wait needn't unplug the device immediately. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] blk: branch hintsNick Piggin
Sprinkle around a few branch hints in the block layer. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] blk: no memory barrierNick Piggin
This memory barrier is not needed because the waitqueue will only get waiters on it in the following situations: rq->count has exceeded the threshold - however all manipulations of ->count are performed under the runqueue lock, and so we will correctly pick up any waiter. Memory allocation for the request fails. In this case, there is no additional help provided by the memory barrier. We are guaranteed to eventually wake up waiters because the request allocation mempool guarantees that if the mem allocation for a request fails, there must be some requests in flight. They will wake up waiters when they are retired. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] blk: cleanup generic tag support error messagesTejun Heo
Add KERN_ERR and __FUNCTION__ to generic tag error messages, and add a comment in blk_queue_end_tag() which explains the silent failure path. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] blk: remove BLK_TAGS_{PER_LONG|MASK}Tejun Heo
Replace BLK_TAGS_PER_LONG with BITS_PER_LONG and remove unused BLK_TAGS_MASK. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] blk: remove blk_queue_tag->real_max_depth optimizationTejun Heo
blk_queue_tag->real_max_depth was used to optimize out unnecessary allocations/frees on tag resize. However, the whole thing was very broken - tag_map was never allocated to real_max_depth resulting in access beyond the end of the map, bits in [max_depth..real_max_depth] were set when initializing a map and copied when resizing resulting in pre-occupied tags. As the gain of the optimization is very small, well, almost nill, remove the whole thing. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] blk: use find_first_zero_bit() in blk_queue_start_tag()Tejun Heo
blk_queue_start_tag() hand-coded searching for the first zero bit in the tag map. Replace it with find_first_zero_bit(). With this patch, blk_queue_star_tag() doesn't need to fill remains of tag map with 1, thus allowing it to work properly with the next remove_real_max_depth patch. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] NUMA aware block device control structure allocationChristoph Lameter
Patch to allocate the control structures for for ide devices on the node of the device itself (for NUMA systems). The patch depends on the Slab API change patch by Manfred and me (in mm) and the pcidev_to_node patch that I posted today. Does some realignment too. Signed-off-by: Justin M. Forbes <jmforbes@linuxtx.org> Signed-off-by: Christoph Lameter <christoph@lameter.com> Signed-off-by: Pravin Shelar <pravin@calsoftinc.com> Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-20[PATCH] sysfs: (driver/block) if show/store is missing return -EIODmitry Torokhov
sysfs: fix drivers/block so if an attribute doesn't implement show or store method read/write will return -EIO instead of 0 or -EINVAL. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20[PATCH] ll_rw_blk.c kerneldoc updatesChristoph Hellwig
The recent mapping changes didn't update the kerneldoc appropriately. Original from Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@suse.de>
2005-06-20[PATCH] update blk_execute_rq to take an at_head parameterJames Bottomley
Original From: Mike Christie <michaelc@cs.wisc.edu> Modified to split out block changes (this patch) and SCSI pieces. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-06-20[PATCH] Add scatter-gather support for the block layer SG_IOJames Bottomley
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-06-20[PATCH] Cleanup blk_rq_map_* interfacesJens Axboe
Change the blk_rq_map_user() and blk_rq_map_kern() interface to require a previously allocated request to be passed in. This is both more efficient for multiple iterations of mapping data to the same request, and it is also a much nicer API. Signed-off-by: Jens Axboe <axboe@suse.de>
2005-06-20[PATCH] Keep the bio end_io parts inside of bio.c for blk_rq_map_kern()Jens Axboe
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-06-20[PATCH] Add blk_rq_map_kern()Mike Christie
Add blk_rq_map_kern which takes a kernel buffer and maps it into a request and bio. This can be used by the dm hw_handlers, old sg_scsi_ioctl, and one day scsi special requests so all requests comming into scsi will have bios. All requests having bios should allow scsi to use scatter lists for all IO and allow it to use block layer functions. Signed-off-by: Jens Axboe <axboe@suse.de>
2005-05-20[SCSI] remove requeue feature from blk_insert_request()Tejun Heo
blk_insert_request() has a unobivous feature of requeuing a request setting REQ_SPECIAL|REQ_SOFTBARRIER. SCSI midlayer was the only user and as previous patches removed the usage, remove the feature from blk_insert_request(). Only special requests should be queued with blk_insert_request(). All requeueing should go through blk_requeue_request(). Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-04-16[PATCH] fix NMI lockup with CFQ scheduler
The current problem seen is that the queue lock is actually in the SCSI device structure, so when that structure is freed on device release, we go boom if the queue tries to access the lock again. The fix here is to move the lock from the scsi_device to the queue. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-04-16[PATCH] use cheaper elv_queue_empty when unplug a deviceKen Chen
In function __generic_unplug_device(), kernel can use a cheaper function elv_queue_empty() instead of more expensive elv_next_request to find whether the queue is empty or not. blk_run_queue can also made conditional on whether queue's emptiness before calling request_fn(). Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16[PATCH] possible use-after-free of bioJens Axboe
There is a possibility that a bio will be accessed after it has been freed on SCSI. It happens if you submit a bio with BIO_SYNC marked and the auto-unplugging kicks the request_fn, SCSI re-enables interrupts in-between so if the request completes between the add_request() in __make_request() and the bio_sync() call, we could be looking at a dead bio. It's a slim race, but it has been triggered in the Real World. So assign bio_sync() to a local variable instead. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!