aboutsummaryrefslogtreecommitdiff
path: root/include/linux/genhd.h
AgeCommit message (Collapse)Author
2008-10-09block: allow disk to have extended device numberTejun Heo
Now that disk and partition handlings are mostly unified, it's easy to allow disk to have extended device number. This patch makes add_disk() use extended device number if disk->minors is zero. Both sd and ide-disk are updated to use this. * sd_format_disk_name() is implemented which can generically determine the drive name. This removes disk number restriction stemming from limited device names. * If sd index goes over SD_MAX_DISKS (which can be increased now BTW), sd simply doesn't initialize minors letting block layer choose extended device number. * If CONFIG_DEBUG_EXT_DEVT is set, both sd and ide-disk always set minors to 0 and use extended device numbers. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: replace @ext_minors with GENHD_FL_EXT_DEVTTejun Heo
With previous changes, it's meaningless to limit the number of partitions. Replace @ext_minors with GENHD_FL_EXT_DEVT such that setting the flag allows the disk to have maximum number of allowed partitions (only limited by the number of entries in parsed_partitions as determined by MAX_PART constant). This kills not-too-pretty alloc_disk_ext[_node]() functions and makes @minors parameter to alloc_disk[_node]() unnecessary. The parameter is left alone to avoid disturbing the users. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: make partition array dynamicTejun Heo
disk->__part used to be statically allocated to the maximum possible number of partitions. This patch makes partition array allocation dynamic. The added overhead is minimal as only real change is one memory dereference changed to RCU one. This saves both a bit of memory and cpu cycles iterating through unoccupied slots and makes increasing partition limit easier. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: move stats from disk to part0Tejun Heo
Move stats related fields - stamp, in_flight, dkstats - from disk to part0 and unify stat handling such that... * part_stat_*() now updates part0 together if the specified partition is not part0. ie. part_stat_*() are now essentially all_stat_*(). * {disk|all}_stat_*() are gone. * part_round_stats() is updated similary. It handles part0 stats automatically and disk_round_stats() is killed. * part_{inc|dec}_in_fligh() is implemented which automatically updates part0 stats for parts other than part0. * disk_map_sector_rcu() is updated to return part0 if no part matches. Combined with the above changes, this makes NULL special case handling in callers unnecessary. * Separate stats show code paths for disk are collapsed into part stats show code paths. * Rename disk_stat_lock/unlock() to part_stat_lock/unlock() While at it, reposition stat handling macros a bit and add missing parentheses around macro parameters. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: kill GENHD_FL_FAIL and use part0->make_it_failTejun Heo
GENHD_FL_FAIL for disk is what make_it_fail is for parts. Kill it and use part0->make_it_fail. Sysfs node handling is unified too. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: always set bdev->bd_partTejun Heo
Till now, bdev->bd_part is set only if the bdev was for parts other than part0. This patch makes bdev->bd_part always set so that code paths don't have to differenciate common handling. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: move holder_dir from disk to part0Tejun Heo
Move disk->holder_dir to part0->holder_dir. Kill now mostly superflous bdev_get_holder(). While at it, kill superflous kobject_get/put() around holder_dir, slave_dir and cmd_filter creation and collapse disk_sysfs_add_subdirs() into register_disk(). These serve no purpose but obfuscating the code. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: move policy from disk to part0Tejun Heo
Move disk->policy to part0->policy. Implement and use get_disk_ro(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: unify sysfs size node handlingTejun Heo
Now that capacity and __dev are moved to part0, part0 and others can share the same method. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: move __dev from disk to part0Tejun Heo
Move disk->__dev to part0->__dev. This simplifies bdget_disk() and lookup_devt() and allows common sysfs attributes to be unified. part_to_disk() is updated to handle part0 -> disk. Updated to include a fix from Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>, he writes: "part0 is a "special" partition and doesn't need to have capacity set - this fixes regression caused by "block: move __dev from disk to part0" commit." Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: move capacity from disk to part0Tejun Heo
Move disk->capacity to part0->nr_sects and convert all users who directly accessed the field to use {get|set}_capacity(). This is done early to allow the __dev field to be moved. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: introduce partition 0Tejun Heo
genhd and partition code handled disk and partitions separately. All information about the whole disk was in struct genhd and partitions in struct hd_struct. However, the whole disk (part0) and other partitions have a lot in common and the data structures end up having good number of common fields and thus separate code paths doing the same thing. Also, the partition array was indexed by partno - 1 which gets pretty confusing at times. This patch introduces partition 0 and makes the partition array indexed by partno. Following patches will unify the handling of disk and parts piece-by-piece. This patch also implements disk_partitionable() which tests whether a disk is partitionable. With coming dynamic partition array change, the most common usage of disk_max_parts() will be testing whether a disk is partitionable and the number of max partitions will become much less important. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: implement and use {disk|part}_to_dev()Tejun Heo
Implement {disk|part}_to_dev() and use them to access generic device instead of directly dereferencing {disk|part}->dev. To make sure no user is left behind, rename generic devices fields to __dev. This is in preparation of unifying partition 0 handling with other partitions. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: implement extended dev numbersTejun Heo
Implement extended device numbers. A block driver can tell block layer that it wants to use extended device numbers. After the usual minor space is used up, block layer automatically allocates devt's from EXT_BLOCK_MAJOR. Currently only one major number is allocated for this but as the allocation is strictly on-demand, ~1mil minor space under it should suffice unless the system actually has more than ~1mil partitions and if that ever happens adding more majors to the extended devt area is easy. Due to internal implementation issues, the first partition can't be allocated on the extended area. In other words, genhd->minors should at least be 1. This limitation will be lifted by later changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: fix diskstats accessTejun Heo
There are two variants of stat functions - ones prefixed with double underbars which don't care about preemption and ones without which disable preemption before manipulating per-cpu counters. It's unclear whether the underbarred ones assume that preemtion is disabled on entry as some callers don't do that. This patch unifies diskstats access by implementing disk_stat_lock() and disk_stat_unlock() which take care of both RCU (for partition access) and preemption (for per-cpu counter access). diskstats access should always be enclosed between the two functions. As such, there's no need for the versions which disables preemption. They're removed and double underbars ones are renamed to drop the underbars. As an extra argument is added, there's no danger of using the old version unconverted. disk_stat_lock() uses get_cpu() and returns the cpu index and all diskstat functions which access per-cpu counters now has @cpu argument to help RT. This change adds RCU or preemption operations at some places but also collapses several preemption ops into one at others. Overall, the performance difference should be negligible as all involved ops are very lightweight per-cpu ones. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: fix disk->part[] dereferencing raceTejun Heo
disk->part[] is protected by its matching bdev's lock. However, non-critical accesses like collecting stats and printing out sysfs and proc information used to be performed without any locking. As partitions can come and go dynamically, partitions can go away underneath those non-critical accesses. As some of those accesses are writes, this theoretically can lead to silent corruption. This patch fixes the race by using RCU for the partition array and dev reference counter to hold partitions. * Rename disk->part[] to disk->__part[] to make sure no one outside genhd layer proper accesses it directly. * Use RCU for disk->__part[] dereferencing. * Implement disk_{get|put}_part() which can be used to get and put partitions from gendisk respectively. * Iterators are implemented to help iterate through all partitions safely. * Functions which require RCU readlock are marked with _rcu suffix. * Use disk_put_part() in __blkdev_put() instead of directly putting the contained kobject. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: don't depend on consecutive minor spaceTejun Heo
* Implement disk_devt() and part_devt() and use them to directly access devt instead of computing it from ->major and ->first_minor. Note that all references to ->major and ->first_minor outside of block layer is used to determine devt of the disk (the part0) and as ->major and ->first_minor will continue to represent devt for the disk, converting these users aren't strictly necessary. However, convert them for consistency. * Implement disk_max_parts() to avoid directly deferencing genhd->minors. * Update bdget_disk() such that it doesn't assume consecutive minor space. * Move devt computation from register_disk() to add_disk() and make it the only one (all other usages use the initially determined value). These changes clean up the code and will help disk->part dereference fix and extended block device numbers. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: make variable and argument names more consistentTejun Heo
In hd_struct, @partno is used to denote partition number and a number of other places use @part to denote hd_struct. Functions use @part and @index instead. This causes confusion and makes it difficult to use consistent variable names for hd_struct. Always use @partno if a variable represents partition number. Also, print out functions use @f or @part for seq_file argument. Use @seqf uniformly instead. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09block: misc updatesTejun Heo
This patch makes the following misc updates in preparation for disk->part dereference fix and extended block devt support. * implment part_to_disk() * fix comment about gendisk->part indexing * rename get_part() to disk_map_sector() * don't use n which is always zero while printing disk information in diskstats_show() Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-27block: move cmdfilter from gendisk to request_queueFUJITA Tomonori
cmd_filter works only for the block layer SG_IO with SCSI block devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI character devices (such as st). We hit a kernel crash with them. The problem is that cmd_filter code accesses to gendisk (having struct blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. It works for only SCSI block device files. With character device files, inode->i_bdev leads you to struct cdev. inode->i_bdev->bd_disk->blk_scsi_cmd_filter isn't safe. SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be independent on any protocols. We shouldn't change ULDs to expose their gendisk. This patch moves struct blk_scsi_cmd_filter from gendisk to request_queue, a common object, which eveyone can access to. The user interface doesn't change; users can change the filters via /sys/block/. gendisk has a pointer to request_queue so the cmd_filter code accesses to struct blk_scsi_cmd_filter. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-25fs/partition/check.c: fix return value warningAbdel Benamrouche
fs/partitions/check.c:381: warning: ignoring return value of ___device_add___, declared with attribute warn_unused_result [akpm@linux-foundation.org: multiple-return-statements-per-function are evil] Signed-off-by: Abdel Benamrouche <draconux@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-03allow userspace to modify scsi command filter on per device basisAdel Gadllah
This patch exports the per-gendisk command filter to user space through sysfs, so it can be changed by the system administrator. All users of the old cmd filter have been converted to use the new one. Original patch from Peter Jones. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Signed-off-by: Peter Jones <pjones@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03block: Block layer data integrity supportMartin K. Petersen
Some block devices support verifying the integrity of requests by way of checksums or other protection information that is submitted along with the I/O. This patch implements support for generating and verifying integrity metadata, as well as correctly merging, splitting and cloning bios and requests that have this extra information attached. See Documentation/block/data-integrity.txt for more information. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-14block: do_mounts - accept root=<non-existant partition>Kay Sievers
Some devices, like md, may create partitions only at first access, so allow root= to be set to a valid non-existant partition of an existing disk. This applies only to non-initramfs root mounting. This fixes a regression from 2.6.24 which did allow this to happen and broke some users machines :( Acked-by: Neil Brown <neilb@suse.de> Tested-by: Joao Luis Meloni Assirati <assirati@nonada.if.usp.br> Cc: stable <stable@kernel.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-05-07block: avoid duplicate calls to get_part() in disk stat codeJens Axboe
get_part() is fairly expensive, as it O(N) loops over partitions to find the right one. In lots of normal IO paths we end up looking up the partition twice, to make matters even worse. Change the stat add code to accept a passed in partition instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-12Remove <linux/genhd.h> from user-visible headers.David Woodhouse
It was all wrapped in '#ifdef CONFIG_BLOCK' anyway, so userspace was getting nothing useful out of it. And the special #ifndef __KERNEL__ version of 'struct partition' makes me inclined to promote an attitude of violence... Stick some comments on some of the #endifs too, while we're at it. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04block/genhd.c: proper externsAdrian Bunk
This patch adds proper externs for two structs in include/linux/genhd.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-04block/genhd.c: cleanupsAdrian Bunk
This patch contains the following cleanups: - make the needlessly global struct disk_type static - #if 0 the unused genhd_media_change_notify() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-08Enhanced partition statistics: remove old partition statisticsJerome Marchand
Removes the now unused old partition statistic code. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-08Enhanced partition statistics: update partition statiticsJerome Marchand
Updates the enhanced partition statistics in generic block layer besides the disk statistics. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-08Enhanced partition statistics: core statisticsJerome Marchand
This patch contain the core infrastructure of enhanced partition statistics. It adds to struct hd_struct the same stats data as struct gendisk and define basics function to manipulate them. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-24Driver core: convert block from raw kobjects to core devicesKay Sievers
This moves the block devices to /sys/class/block. It will create a flat list of all block devices, with the disks and partitions in one directory. For compatibility /sys/block is created and contains symlinks to the disks. /sys/class/block |-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda |-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1 |-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10 |-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5 |-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6 |-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7 |-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8 |-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9 `-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 /sys/block/ |-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda `-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-30[PARTITION] MSDOS: Fix Sun num_partitions handling.Mark Fortescue
Correct the Solaris x86 number of partitions (slices) is a way that is backward compatible with the earlier size. This works without a new VTOC structure definition as the timestamp and v_asciilabel fields in the VTOC are not used by the kernel yet. Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-23genhd: send async notification on media changeKristen Carlson Accardi
Send an uevent to user space to indicate that a media change event has occurred. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23genhd: expose AN to user spaceKristen Carlson Accardi
Allow user space to determine if a disk supports Asynchronous Notification of media changes. This is done by adding a new sysfs file "capability_flags", which is documented in (insert file name). This sysfs file will export all disk capabilities flags to user space. We also define a new flag to define the media change notification capability. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11Fix compile/link of init/do_mounts.c with !CONFIG_BLOCKJens Axboe
We need a stub function for when CONFIG_BLOCK isn't set. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-05-09Display all possible partitions when the root filesystem failed to mountDave Gilbert
Display all possible partitions when the root filesystem is not mounted. This helps to track spell'o's and missing drivers. Updated to work with newer kernels. Example output: VFS: Cannot open root device "foobar" or unknown-block(0,0) Please append a correct "root=" boot option; here are the available partitions: 0800 8388608 sda driver: sd 0801 192748 sda1 0802 8193150 sda2 0810 4194304 sdb driver: sd Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) [akpm@linux-foundation.org: cleanups, fix printk warnings] Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Cc: Dave Gilbert <linux@treblig.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Update defconfig. [SPARC64]: Add PCI MSI support on Niagara. [SPARC64] IRQ: Use irq_desc->chip_data instead of irq_desc->handler_data [SPARC64]: Add obppath sysfs attribute for SBUS and PCI devices. [PARTITION]: Add whole_disk attribute.
2007-02-11[PATCH] relax check for AIX in msdos partition tableOlaf Hering
The patch to identify AIX disks and ignore them has caused at least one machine to fail to find the root partition on 2.6.19. The patch is: http://lkml.org/lkml/2006/7/31/117 The problem is some disk formatters do not blow away the first 4 bytes of the disk. If the disk we are installing to used to have AIX on it, then the first 4 bytes will still have IBMA in EBCDIC. The install in question was debian etch. Im not sure what the best fix is, perhaps the AIX detection code could check more than the first 4 bytes. The whole partition info for primary partitions is in this block: dd if=/dev/sdb count=$(( 4 * 16 )) bs=1 skip=$(( 0x1be )) All other data do not matter, beside the 0x55aa marker at the end of the first block. Signed-off-by: Olaf Hering <olh@suse.de> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-10[PARTITION]: Add whole_disk attribute.Fabio Massimo Di Nitto
Some partitioning systems create special partitions that span the entire disk. One example are Sun partitions, and this whole-disk partition exists to tell the firmware the extent of the entire device so it can load the boot block and do other things. Such partitions should not be treated as normal partitions, because all the other partitions overlap this whole-disk one. So we'd see multiple instances of the same UUID etc. which we do not want. udev and friends can thus search for this 'whole_disk' attribute and use it to decide to ignore the partition. Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-08[PATCH] fault-injection capability for disk IOAkinobu Mita
This patch provides fault-injection capability for disk IO. Boot option: fail_make_request=<probability>,<interval>,<space>,<times> <interval> -- specifies the interval of failures. <probability> -- specifies how often it should fail in percent. <space> -- specifies the size of free space where disk IO can be issued safely in bytes. <times> -- specifies how many times failures may happen at most. Debugfs: /debug/fail_make_request/interval /debug/fail_make_request/probability /debug/fail_make_request/specifies /debug/fail_make_request/times Example: fail_make_request=10,100,0,-1 echo 1 > /sys/blocks/hda/hda1/make-it-fail generic_make_request() on /dev/hda1 fails once per 10 times. Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30[PATCH] BLOCK: Make it possible to disable the block layer [try #6]David Howells
Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present. This patch does the following: (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support. (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes: (*) Block I/O tracing. (*) Disk partition code. (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS. (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this. (*) Various block-based device drivers, such as IDE and the old CDROM drivers. (*) MTD blockdev handling and FTL. (*) JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book. (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available. (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h. (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK. (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK. (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled. (*) fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set: (*) Default blockdev file operations (to give error ENODEV on opening). (*) Makes some /proc changes: (*) /proc/devices does not list any blockdevs. (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK. (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK. (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified. (*) In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2. (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so). (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-06-26[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer neededGreg Kroah-Hartman
And remove the now unneeded number field. Also fixes all drivers that set these fields. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-25Include various private files only from within __KERNEL__ in genhd.hDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-14[PATCH] BLOCK: delay all uevents until partition table is scannedKay Sievers
[BLOCK] delay all uevents until partition table is scanned Here we delay the annoucement of all block device events until the disk's partition table is scanned and all partition devices are already created and sysfs is populated. We have a bunch of old bugs for removable storage handling where we probe successfully for a filesystem on the raw disk, but at the same time the kernel recognizes a partition table and creates partition devices. Currently there is no sane way to tell if partitions will show up or not at the time the disk device is announced to userspace. With the delayed events we can simply skip any probe for a filesystem on the raw disk when we find already present partitions. Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-28[PATCH] for_each_possible_cpu: fixes for generic partKAMEZAWA Hiroyuki
replaces for_each_cpu with for_each_possible_cpu(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] Don't make debugfs depend on DEBUG_KERNEL [PATCH] Fix blktrace compile with sysfs not defined [PATCH] unused label in drivers/block/cciss. [BLOCK] increase size of disk stat counters [PATCH] blk_execute_rq_nowait-speedup [PATCH] ide-cd: quiet down GPCMD_READ_CDVD_CAPACITY failure [BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion [PATCH] kzalloc() conversion in drivers/block [PATCH] update max_sectors documentation
2006-03-27[PATCH] dm-md-dependency-tree-in-sysfs-holders-slaves-subdirectory-tidyAndrew Morton
Remove all the CONFIG_SYSFS stuff. That's supposed to all be implemented up in header files. Yes, the CONFIG_SYSFS=n data structures will be a little larger than necessary, but that's a tradeoff we can decide to make. Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27[PATCH] dm/md dependency tree in sysfs: holders/slaves subdirectoryJun'ichi Nomura
Creating "slaves" and "holders" directories in /sys/block/<disk> and creating "holders" directory under /sys/block/<disk>/<partition> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27[BLOCK] increase size of disk stat countersBen Woodard
The kernel's representation of the disk statistics uses the type unsigned which is 32b on both 32b and 64b platforms. Unfortunately, most system tools that work with these numbers that are exported in /proc/diskstats including iostat read these numbers into unsigned longs. This works fine on 32b platforms and when the number of IO transactions are small on 64b platforms. However, when the numbers wrap on 64b platforms & you read the numbers into unsigned longs, and compare the numbers to previous readings, then you get an unsigned representation of a negative number. This looks like a very large 64b number & gives you bizarre readouts in iostat: ilc4: Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util ilc4: sda 5.50 0.00 143.96 0.00 307496983987862656.00 0.00 153748491993931328.00 0.00 2136028725038430.00 7.94 55.12 5.59 80.42 Though fixing iostat in user space is possible, and a quick survey indicates that several other similar tools also use unsigned longs when processing /proc/diskstats. Therefore, it seems like a better approach would be to extend the length of the disk_stats structure on 64b architectures to 64b. The following patch does that. It should not affect the operation on 32b platforms. Signed-off-by: Ben Woodard <woodard@redhat.com> Cc: Rick Lindsley <ricklind@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jens Axboe <axboe@suse.de>