aboutsummaryrefslogtreecommitdiff
path: root/block
AgeCommit message (Collapse)Author
2009-10-28cfq-iosched: adapt slice to number of processes doing I/OCorrado Zoccolo
When the number of processes performing I/O concurrently increases, a fixed time slice per process will cause large latencies. This patch, if low_latency mode is enabled, will scale the time slice assigned to each process according to a 300ms target latency. In order to keep fairness among processes: * The number of active processes is computed using a special form of running average, that quickly follows sudden increases (to keep latency low), and decrease slowly (to have fairness in spite of rapid decreases of this value). To safeguard sequential bandwidth, we impose a minimum time slice (computed using 2*cfq_slice_idle as base, adjusted according to priority and async-ness). Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-27cfq-iosched: improve hw_tag detectionShaohua Li
If active queue hasn't enough requests and idle window opens, cfq will not dispatch sufficient requests to hardware. In such situation, current code will zero hw_tag. But this is because cfq doesn't dispatch enough requests instead of hardware queue doesn't work. Don't zero hw_tag in such case. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-26cfq: break apart merged cfqqs if they stop cooperatingJeff Moyer
cfq_queues are merged if they are issuing requests within the mean seek distance of one another. This patch detects when the coopearting stops and breaks the queues back up. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-26cfq: change the meaning of the cfqq_coop flagJeff Moyer
The flag used to indicate that a cfqq was allowed to jump ahead in the scheduling order due to submitting a request close to the queue that just executed. Since closely cooperating queues are now merged, the flag holds little meaning. Change it to indicate that multiple queues were merged. This will later be used to allow the breaking up of merged queues when they are no longer cooperating. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-26cfq: merge cooperating cfq_queuesJeff Moyer
When cooperating cfq_queues are detected currently, they are allowed to skip ahead in the scheduling order. It is much more efficient to automatically share the cfq_queue data structure between cooperating processes. Performance of the read-test2 benchmark (which is written to emulate the dump(8) utility) went from 12MB/s to 90MB/s on my SATA disk. NFS servers with multiple nfsd threads also saw performance increases. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-26cfq: calculate the seek_mean per cfq_queue not per cfq_io_contextJeff Moyer
async cfq_queue's are already shared between processes within the same priority, and forthcoming patches will change the mapping of cic to sync cfq_queue from 1:1 to 1:N. So, calculate the seekiness of a process based on the cfq_queue instead of the cfq_io_context. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-24block: silently error unsupported empty barriers tooMark McLoughlin
With 2.6.32-rc5 in a KVM guest using dm and virtio_blk, we see the following errors: end_request: I/O error, dev vda, sector 0 end_request: I/O error, dev vda, sector 0 The errors go away if dm stops submitting empty barriers, by reverting: commit 52b1fd5a27c625c78373e024bf570af3c9d44a79 Author: Mikulas Patocka <mpatocka@redhat.com> dm: send empty barriers to targets in dm_flush We should silently error all barriers, even empty barriers, on devices like virtio_blk which don't support them. See also: https://bugzilla.redhat.com/514901 Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Alasdair G Kergon <agk@redhat.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Neil Brown <neilb@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-13Merge branch 'for-linus' into for-2.6.33Jens Axboe
2009-10-12blk-settings: fix function parameter kernel-doc notationRandy Dunlap
Fix kernel-doc notation in blk-settings.c::blk_queue_max_discard_sectors(). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-09elv_iosched_store(): fix strstrip() misuseKOSAKI Motohiro
elv_iosched_store() ignore the return value of strstrip(). It makes small inconsistent behavior. This patch fixes it. <before> ==================================== # cd /sys/block/{blockdev}/queue case1: # echo "anticipatory" > scheduler # cat scheduler noop [anticipatory] deadline cfq case2: # echo "anticipatory " > scheduler # cat scheduler noop [anticipatory] deadline cfq case3: # echo " anticipatory" > scheduler bash: echo: write error: Invalid argument <after> ==================================== # cd /sys/block/{blockdev}/queue case1: # echo "anticipatory" > scheduler # cat scheduler noop [anticipatory] deadline cfq case2: # echo "anticipatory " > scheduler # cat scheduler noop [anticipatory] deadline cfq case3: # echo " anticipatory" > scheduler noop [anticipatory] deadline cfq Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-08cfq-iosched: avoid probable slice overrun when idlingCorrado Zoccolo
If the average think time is larger than the remaining time slice for any given queue, don't allow it to idle. A succesful idle also means that we need to dispatch and complete a request, so if we don't even have time left for the idle process, we would overrun the slice in any case. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-07cfq-iosched: apply bool value where we return 0/1Jens Axboe
Saves 16 bytes of text, woohoo. But the more important point is that it makes the code more readable when returning bool for 0/1 cases. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-07cfq-iosched: fix think time allowed for seekersCorrado Zoccolo
CFQ enables idle only for processes that think less than the allowed idle time. Since idle time is lower for seeky queues, we should use the correct value in the comparison. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-06cfq-iosched: fix the slice residual signJens Axboe
We should subtract the slice residual from the rb tree key, since a negative residual count indicates that the cfqq overran its slice the last time. Hence we want to add the overrun time, to position it a bit further away in the service tree. Reported-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-06cfq-iosched: abstract out the 'may this cfqq dispatch' logicJens Axboe
Makes the whole thing easier to read, cfq_dispatch_requests() was a bit messy before. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-06block: use proper BLK_RW_ASYNC in blk_queue_start_tag()Jens Axboe
Makes it easier to read than the 0. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-06block: Seperate read and write statistics of in_flight requests v2Nikanth Karthikesan
Commit a9327cac440be4d8333bba975cbbf76045096275 added seperate read and write statistics of in_flight requests. And exported the number of read and write requests in progress seperately through sysfs. But Corrado Zoccolo <czoccolo@gmail.com> reported getting strange output from "iostat -kx 2". Global values for service time and utilization were garbage. For interval values, utilization was always 100%, and service time is higher than normal. So this was reverted by commit 0f78ab9899e9d6acb09d5465def618704255963b The problem was in part_round_stats_single(), I missed the following: if (now == part->stamp) return; - if (part->in_flight) { + if (part_in_flight(part)) { __part_stat_add(cpu, part, time_in_queue, part_in_flight(part) * (now - part->stamp)); __part_stat_add(cpu, part, io_ticks, (now - part->stamp)); With this chunk included, the reported regression gets fixed. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> -- Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-05block: get rid of kblock_schedule_delayed_work()Jens Axboe
It was briefly introduced to allow CFQ to to delayed scheduling, but we ended up removing that feature again. So lets kill the function and export, and just switch CFQ back to the normal work schedule since it is now passing in a '0' delay from all call sites. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-05cfq-iosched: fix possible problem with jiffies wraparoundCorrado Zoccolo
The RR service tree is indexed by a key that is relative to current jiffies. This can cause problems on jiffies wraparound. The patch fixes it using time_before comparison, and changing the add_front path to use a relative number, too. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-05cfq-iosched: fix issue with rq-rq merging and fifo list orderingJens Axboe
cfq uses rq->start_time as the fifo indicator, but that field may get modified prior to cfq doing it's fifo list adjustment when a request gets merged with another request. This can cause the fifo list to become unordered. Reported-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-05Merge branch 'master' into for-2.6.33Jens Axboe
2009-10-04Revert "Seperate read and write statistics of in_flight requests"Jens Axboe
This reverts commit a9327cac440be4d8333bba975cbbf76045096275. Corrado Zoccolo <czoccolo@gmail.com> reports: "with 2.6.32-rc1 I started getting the following strange output from "iostat -kx 2": Linux 2.6.31bisect (et2) 04/10/2009 _i686_ (2 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 10,70 0,00 3,16 15,75 0,00 70,38 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 18,22 0,00 0,67 0,01 14,77 0,02 43,94 0,01 10,53 39043915,03 2629219,87 sdb 60,89 9,68 50,79 3,04 1724,43 50,52 65,95 0,70 13,06 488437,47 2629219,87 avg-cpu: %user %nice %system %iowait %steal %idle 2,72 0,00 0,74 0,00 0,00 96,53 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 sdb 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 avg-cpu: %user %nice %system %iowait %steal %idle 6,68 0,00 0,99 0,00 0,00 92,33 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 sdb 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 avg-cpu: %user %nice %system %iowait %steal %idle 4,40 0,00 0,73 1,47 0,00 93,40 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 sdb 0,00 4,00 0,00 3,00 0,00 28,00 18,67 0,06 19,50 333,33 100,00 Global values for service time and utilization are garbage. For interval values, utilization is always 100%, and service time is higher than normal. I bisected it down to: [a9327cac440be4d8333bba975cbbf76045096275] Seperate read and write statistics of in_flight requests and verified that reverting just that commit indeed solves the issue on 2.6.32-rc1." So until this is debugged, revert the bad commit. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-04cfq-iosched: don't delay async queue if it hasn't dispatched at allJens Axboe
We cannot delay for the first dispatch of the async queue if it hasn't dispatched at all, since that could present a local user DoS attack vector using an app that just did slow timed sync reads while filling memory. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-03block: Topology ioctlsMartin K. Petersen
Not all users of the topology information want to use libblkid. Provide the topology information through bdev ioctls. Also clarify sector size comments for existing BLK ioctls. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-03cfq-iosched: use assigned slice sync value, not defaultJens Axboe
We should use the sysfs modified slice sync value, in case it differs from the default. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-03cfq-iosched: rename 'desktop' sysfs entry to 'low_latency'Jens Axboe
Don't think that's necessarily a perfect description of what this option fiddles with, but it's probably better than 'desktop'. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-03cfq-iosched: implement slower async initiate and queue ramp upJens Axboe
This slowly ramps up the async queue depth based on the time passed since the sync IO, and doesn't allow async at all until a sync slice period has passed. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-03cfq-iosched: delay async IO dispatch, if sync IO was just doneVivek Goyal
o Do not allow more than max_dispatch requests from an async queue, if some sync request has finished recently. This is in the hope that sync activity is still going on in the system and we might receive a sync request soon. Most likely from a sync queue which finished a request and we did not enable idling on it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-03block: CFQ is more than a desktop schedulerJens Axboe
Update Kconfig.iosched entry. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-03block: remove the anticipatory IO schedulerJens Axboe
AS is mostly a subset of CFQ, so there's little point in still providing this separate IO scheduler. Hopefully at some point we can get down to one single IO scheduler again, at least this brings us closer by having only one intelligent IO scheduler. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-02cfq-iosched: add a knob for desktop interactivenessJens Axboe
This is basically identical to what Vivek Goyal posted, but combined into one and labelled 'desktop' instead of 'fairness'. The goal is to continue to improve on the latency side of things as it relates to interactiveness, keeping the questionable bits under this sysfs tunable so it would be easy for throughput-only people to turn off. Apart from adding the interactive sysfs knob, it also adds the behavioural change of allowing slice idling even if the hardware does tagged command queuing. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01Add a tracepoint for block request remappingJun'ichi Nomura
Since 2.6.31 now has request-based device-mapper, it's useful to have a tracepoint for request-remapping as well as bio-remapping. This patch adds a tracepoint for request-remapping, trace_block_rq_remap(). Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01block: allow large discard requestsChristoph Hellwig
Currently we set the bio size to the byte equivalent of the blocks to be trimmed when submitting the initial DISCARD ioctl. That means it is subject to the max_hw_sectors limitation of the HBA which is much lower than the size of a DISCARD request we can support. Add a separate max_discard_sectors tunable to limit the size for discard requests. We limit the max discard request size in bytes to 32bit as that is the limit for bio->bi_size. This could be much larger if we had a way to pass that information through the block layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01block: use normal I/O path for discard requestsChristoph Hellwig
prepare_discard_fn() was being called in a place where memory allocation was effectively impossible. This makes it inappropriate for all but the most trivial translations of Linux's DISCARD operation to the block command set. Additionally adding a payload there makes the ownership of the bio backing unclear as it's now allocated by the device driver and not the submitter as usual. It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether the queue supports discard operations or not. blkdev_issue_discard now allocates a one-page, sector-length payload which is the right thing for the common ATA and SCSI implementations. The mtd implementation of prepare_discard_fn() is replaced with simply checking for the request being a discard. Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx> which did the prepare_discard_fn but not the different payload allocation yet. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01Add a tracepoint for block request remappingJun'ichi Nomura
Since 2.6.31 now has request-based device-mapper, it's useful to have a tracepoint for request-remapping as well as bio-remapping. This patch adds a tracepoint for request-remapping, trace_block_rq_remap(). Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01block: allow large discard requestsChristoph Hellwig
Currently we set the bio size to the byte equivalent of the blocks to be trimmed when submitting the initial DISCARD ioctl. That means it is subject to the max_hw_sectors limitation of the HBA which is much lower than the size of a DISCARD request we can support. Add a separate max_discard_sectors tunable to limit the size for discard requests. We limit the max discard request size in bytes to 32bit as that is the limit for bio->bi_size. This could be much larger if we had a way to pass that information through the block layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01block: use normal I/O path for discard requestsChristoph Hellwig
prepare_discard_fn() was being called in a place where memory allocation was effectively impossible. This makes it inappropriate for all but the most trivial translations of Linux's DISCARD operation to the block command set. Additionally adding a payload there makes the ownership of the bio backing unclear as it's now allocated by the device driver and not the submitter as usual. It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether the queue supports discard operations or not. blkdev_issue_discard now allocates a one-page, sector-length payload which is the right thing for the common ATA and SCSI implementations. The mtd implementation of prepare_discard_fn() is replaced with simply checking for the request being a discard. Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx> which did the prepare_discard_fn but not the different payload allocation yet. Signed-off-by: Christoph Hellwig <hch@lst.de>
2009-10-01Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfsZdenek Kabelac
Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfs introduced in commit 1d54ad6da9192fed5dd3b60224d9f2dfea0dcd82. Release kobject also in case the request_fn is NULL. Problem was noticed via kmemleak backtrace when some sysfs entries were note properly destroyed during device removal: unreferenced object 0xffff88001aa76640 (size 80): comm "lvcreate", pid 2120, jiffies 4294885144 hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 f0 65 a7 1a 00 88 ff ff .........e...... 90 66 a7 1a 00 88 ff ff 86 1d 53 81 ff ff ff ff .f........S..... backtrace: [<ffffffff813f9cc6>] kmemleak_alloc+0x26/0x60 [<ffffffff8111d693>] kmem_cache_alloc+0x133/0x1c0 [<ffffffff81195891>] sysfs_new_dirent+0x41/0x120 [<ffffffff81194b0c>] sysfs_add_file_mode+0x3c/0xb0 [<ffffffff81197c81>] internal_create_group+0xc1/0x1a0 [<ffffffff81197d93>] sysfs_create_group+0x13/0x20 [<ffffffff810d8004>] blk_trace_init_sysfs+0x14/0x20 [<ffffffff8123f45c>] blk_register_queue+0x3c/0xf0 [<ffffffff812447e4>] add_disk+0x94/0x160 [<ffffffffa00d8b08>] dm_create+0x598/0x6e0 [dm_mod] [<ffffffffa00de951>] dev_create+0x51/0x350 [dm_mod] [<ffffffffa00de823>] ctl_ioctl+0x1a3/0x240 [dm_mod] [<ffffffffa00de8f2>] dm_compat_ctl_ioctl+0x12/0x20 [dm_mod] [<ffffffff81177bfd>] compat_sys_ioctl+0xcd/0x4f0 [<ffffffff81036ed8>] sysenter_dispatch+0x7/0x2c [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Zdenek Kabelac <zkabelac@redhat.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01block: Do not clamp max_hw_sectors for stacking devicesMartin K. Petersen
Stacking devices do not have an inherent max_hw_sector limit. Set the default to INT_MAX so we are bounded only by capabilities of the underlying storage. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01block: Set max_sectors correctly for stacking devicesMartin K. Petersen
The topology changes unintentionally caused SAFE_MAX_SECTORS to be set for stacking devices. Set the default limit to BLK_DEF_MAX_SECTORS and provide SAFE_MAX_SECTORS in blk_queue_make_request() for legacy hw drivers that depend on the old behavior. Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-19Driver-Core: extend devnode callbacks to provide permissionsKay Sievers
This allows subsytems to provide devtmpfs with non-default permissions for the device node. Instead of the default mode of 0600, null, zero, random, urandom, full, tty, ptmx now have a mode of 0666, which allows non-privileged processes to access standard device nodes in case no other userspace process applies the expected permissions. This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complain. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev debugfs: Modify default debugfs directory for debugging pktcdvd. debugfs: Modified default dir of debugfs for debugging UHCI. debugfs: Change debugfs directory of IWMC3200 debugfs: Change debuhgfs directory of trace-events-sample.h debugfs: Fix mount directory of debugfs by default in events.txt hpilo: add poll f_op hpilo: add interrupt handler hpilo: staging for interrupt handling driver core: platform_device_add_data(): use kmemdup() Driver core: Add support for compatibility classes uio: add generic driver for PCI 2.3 devices driver-core: move dma-coherent.c from kernel to driver/base mem_class: fix bug mem_class: use minor as index instead of searching the array driver model: constify attribute groups UIO: remove 'default n' from Kconfig Driver core: Add accessor for device platform data Driver core: move dev_get/set_drvdata to drivers/base/dd.c Driver core: add new device to bus's list before probing
2009-09-15driver model: constify attribute groupsDavid Brownell
Let attribute group vectors be declared "const". We'd like to let most attribute metadata live in read-only sections... this is a start. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c
2009-09-14Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits) block: use blkdev_issue_discard in blk_ioctl_discard Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads block: don't assume device has a request list backing in nr_requests store block: Optimal I/O limit wrapper cfq: choose a new next_req when a request is dispatched Seperate read and write statistics of in_flight requests aoe: end barrier bios with EOPNOTSUPP block: trace bio queueing trial only when it occurs block: enable rq CPU completion affinity by default cfq: fix the log message after dispatched a request block: use printk_once cciss: memory leak in cciss_init_one() splice: update mtime and atime on files block: make blk_iopoll_prep_sched() follow normal 0/1 return convention cfq-iosched: get rid of must_alloc flag block: use interrupts disabled version of raise_softirq_irqoff() block: fix comment in blk-iopoll.c block: adjust default budget for blk-iopoll block: fix long lines in block/blk-iopoll.c block: add blk-iopoll, a NAPI like approach for block devices ...
2009-09-14block: use blkdev_issue_discard in blk_ioctl_discardChristoph Hellwig
blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard, the only difference between the two is that blkdev_issue_discard needs to send a barrier discard request and blk_ioctl_discard a non-barrier one, and blk_ioctl_discard needs to wait on the request. To facilitates this add a flags argument to blkdev_issue_discard to control both aspects of the behaviour. This will be very useful later on for using the waiting funcitonality for other callers. Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14block: don't assume device has a request list backing in nr_requests storeJens Axboe
Stacked devices do not. For now, just error out with -EINVAL. Later we could make the limit apply on stacked devices too, for throttling reasons. This fixes 5a54cd13353bb3b88887604e2c980aa01e314309 and should go into 2.6.31 stable as well. Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14block: Optimal I/O limit wrapperMartin K. Petersen
Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper around it. DM needs this to avoid poking at the queue_limits directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14cfq: choose a new next_req when a request is dispatchedJeff Moyer
This patch addresses http://bugzilla.kernel.org/show_bug.cgi?id=13401, a regression introduced in 2.6.30. From the bug report: Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14Seperate read and write statistics of in_flight requestsNikanth Karthikesan
Currently, there is a single in_flight counter measuring the number of requests in the request_queue. But some monitoring tools would like to know how many read requests and write requests are in progress. Split the current in_flight counter into two seperate counters for read and write. This information is exported as a sysfs attribute, as changing the currently available stat files would break the existing tools. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>