aboutsummaryrefslogtreecommitdiff
path: root/drivers/block
AgeCommit message (Collapse)Author
2008-03-04cciss: remove READ_AHEAD define and use block layer defaultsMike Miller
This patch removes the #define READ_AHEAD 1024 from the driver and uses the block layer defaults, instead. We have found that under certain workloads the setting can cause a disk connected to the e200 controller to go offline. If the disk hiccups the link may try to downshift but the controller is never notified that the link successfully completed the renegotiation. We've also found that performance using the block layer default of 32 pages was on par with the 1024 setting. We tried setting it to zero at one time based on info from our firmware guys but that killed performance. Turns out we were talking about 2 different read ahead settings. Please consider this for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-04resubmit: cciss: procfs updates to display info about manyMike Miller
volumes This patch allows us to display information about all of the logical volumes configured on a particular controller without stepping on memory even when there are many volumes (128 or more) configured. Please consider this for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-23NBD: make nbd default to deadline I/O schedulerPaul Clements
NBD doesn't work well with CFQ (or AS) schedulers, so let's default to something else. The two problems I have experienced with nbd and cfq are: 1) nbd hangs with cfq on RHEL 5 (2.6.18) -- this may well have been fixed There's a similar debian bug that has been filed as well: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=447638 There have been posts to nbd-general mailing list about problems with cfq and nbd also. 2) nbd performs about 10% better (the last time I tested) with deadline vs. cfq (the overhead of cfq doesn't provide much advantage to nbd [not being a real disk], and you end up going through the I/O scheduler on the nbd server anyway, so it makes sense that deadline is better with nbd) Signed-off-by: Paul Clements <paul.clements@steeleye.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-21xen: Implement getgeo for Xen virtual block device.Ian Campbell
The below implements the getgeo hook for Xen block devices. Extracted from the xen-unstable tree where it has been used for ages. It is useful to have because it allows things like grub2 (used by the Debian installer images) to work in a guest domain without having to sprinkle Xen specific hacks around the place. Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14Fix compile of swim3 as moduleTony Breeds
The current pmac32_defconfig fails to build with the following error: Building modules, stage 2. ERROR: "check_media_bay" [drivers/block/swim3.ko] undefined! WARNING: modpost: Found 23 section mismatch(es). To see full details build your kernel with: 'make CONFIG_DEBUG_SECTION_MISMATCH=y' make[2]: *** [__modpost] Error 1 This patch fixes that. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-09ub: fix up the conversion to sg_init_table()Pete Zaitcev
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Cc: "Oliver Pinter" <oliver.pntr@gmail.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: Enhanced partition statistics: documentation update Enhanced partition statistics: remove old partition statistics Enhanced partition statistics: procfs Enhanced partition statistics: sysfs Enhanced partition statistics: aoe fix Enhanced partition statistics: update partition statitics Enhanced partition statistics: core statistics block: fixup rq_init() a bit Manually fixed conflict in drivers/block/aoe/aoecmd.c due to statistics support.
2008-02-08NBD: remove limit on max number of nbd devicesPaul Clements
Remove the arbitrary 128 device limit for NBD. nbds_max can now be set to any number. In certain scenarios where devices are used sparsely we have run into the 128 device limit. Signed-off-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: statically initialise devlist_lockAndrew Morton
I guess aoedev_init() can go away now. Cc: Greg KH <greg@kroah.com> Cc: "Ed L. Cashin" <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: update copyright dateEd L. Cashin
Update the year in the copyright notices. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: make error messages more specificEd L. Cashin
Andrew Morton pointed out that the "too many targets" message in patch 2 could be printed for failing GFP_ATOMIC allocations. This patch makes the messages more specific. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: the aoeminor doesn't need a long formatEd L. Cashin
The aoedev aoeminor member doesn't need a long format. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: add module parameter for users who need more outstanding I/OEd L. Cashin
An AoE target provides an estimate of the number of outstanding commands that the AoE initiator can send before getting a response. The aoe_maxout parameter provides a way to set an even lower limit. It will not allow a user to use more outstanding commands than the target permits. If a user discovers a problem with a large setting, this parameter provides a way for us to work with them to debug the problem. We expect to improve the dynamic window sizing algorithm and drop this parameter. For the time being, it is a debugging aid. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: only install new AoE device onceEd L. Cashin
An aoe driver user who had about 70 AoE targets found that he was hitting a BUG in sysfs_create_file because the aoe driver was trying to tell the kernel about an AoE device more than once. Each AoE device was reachable by several local network interfaces, and multiple ATA device indentify responses were returning from that single device. This patch eliminates a race condition so that aoe always informs the block layer of a new AoE device once in the presence of multiple incoming ATA device identify responses. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: dynamically allocate a capped number of skbs when necessaryEd L. Cashin
What this Patch Does Even before this recent series of 12 patches to 2.6.22-rc4, the aoe driver was reusing a small set of skbs that were allocated once and were only used for outbound AoE commands. The network layer cannot be allowed to put_page on the data that is still associated with a bio we haven't returned to the block layer, so the aoe driver (even before the patch under discussion) is still the owner of skbs that have been handed to the network layer for transmission. We need to keep track of these skbs so that we can free them, but by tracking them, we can also easily re-use them. The new patch was a response to the behavior of certain network drivers. We cannot reuse an skb that the network driver still has in its transmit ring. Network drivers can defer transmit ring cleanup and then use the state in the skb to determine how many data segments to clean up in its transmit ring. The tg3 driver is one driver that behaves in this way. When the network driver defers cleanup of its transmit ring, the aoe driver can find itself in a situation where it would like to send an AoE command, and the AoE target is ready for more work, but the network driver still has all of the pre-allocated skbs. In that case, the new patch just calls alloc_skb, as you'd expect. We don't want to get carried away, though. We try not to do excessive allocation in the write path, so we cap the number of skbs we dynamically allocate. Probably calling it a "dynamic pool" is misleading. We were already trying to use a small fixed-size set of pre-allocated skbs before this patch, and this patch just provides a little headroom (with a ceiling, though) to accomodate network drivers that hang onto skbs, by allocating when needed. The d->skbpool_hd list of allocated skbs is necessary so that we can free them later. We didn't notice the need for this headroom until AoE targets got fast enough. Alternatives If the network layer never did a put_page on the pages in the bio's we get from the block layer, then it would be possible for us to hand skbs to the network layer and forget about them, allowing the network layer to free skbs itself (and thereby calling our own skb->destructor callback function if we needed that). In that case we could get rid of the pre-allocated skbs and also the d->skbpool_hd, instead just calling alloc_skb every time we wanted to transmit a packet. The slab allocator would effectively maintain the list of skbs. Besides a loss of CPU cache locality, the main concern with that approach the danger that it would increase the likelihood of deadlock when VM is trying to free pages by writing dirty data from the page cache through the aoe driver out to persistent storage on an AoE device. Right now we have a situation where we have pre-allocation that corresponds to how much we use, which seems ideal. Of course, there's still the separate issue of receiving the packets that tell us that a write has successfully completed on the AoE target. When memory is low and VM is using AoE to flush dirty data to free up pages, it would be perfect if there were a way for us to register a fast callback that could recognize write command completion responses. But I don't think the current problems with the receive side of the situation are a justification for exacerbating the problem on the transmit side. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: user can ask driver to forget previously detected devicesEd L. Cashin
When an AoE device is detected, the kernel is informed, and a new block device is created. If the device is unused, the block device corresponding to remote device that is no longer available may be removed from the system by telling the aoe driver to "flush" its list of devices. Without this patch, software like GPFS and LVM may attempt to read from AoE devices that were discovered earlier but are no longer present, blocking until the I/O attempt times out. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: eliminate goto and improve readabilityEd L. Cashin
Adam Richter suggested eliminating this goto. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: mac_addr: avoid 64-bit arch compiler warningsEd L. Cashin
By returning unsigned long long, mac_addr does not generate compiler warnings on 64-bit architectures. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: handle multiple network paths to AoE deviceEd L. Cashin
A remote AoE device is something can process ATA commands and is identified by an AoE shelf number and an AoE slot number. Such a device might have more than one network interface, and it might be reachable by more than one local network interface. This patch tracks the available network paths available to each AoE device, allowing them to be used more efficiently. Andrew Morton asked about the call to msleep_interruptible in the revalidate function. Yes, if a signal is pending, then msleep_interruptible will not return 0. That means we will not loop but will call aoenet_xmit with a NULL skb, which is a noop. If the system is too low on memory or the aoe driver is too low on frames, then the user can hit control-C to interrupt the attempt to do a revalidate. I have added a comment to the code summarizing that. Andrew Morton asked whether the allocation performed inside addtgt could use a more relaxed allocation like GFP_KERNEL, but addtgt is called when the aoedev lock has been locked with spin_lock_irqsave. It would be nice to allocate the memory under fewer restrictions, but targets are only added when the device is being discovered, and if the target can't be added right now, we can try again in a minute when then next AoE config query broadcast goes out. Andrew Morton pointed out that the "too many targets" message could be printed for failing GFP_ATOMIC allocations. The last patch in this series makes the messages more specific. Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aoe: bring driver version number to 47Ed L. Cashin
Signed-off-by: Ed L. Cashin <ecashin@coraid.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08rd: support XIPNick Piggin
Support direct_access XIP method with brd. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08rewrite rdNick Piggin
This is a rewrite of the ramdisk block device driver. The old one is really difficult because it effectively implements a block device which serves data out of its own buffer cache. It relies on the dirty bit being set, to pin its backing store in cache, however there are non trivial paths which can clear the dirty bit (eg. try_to_free_buffers()), which had recently lead to data corruption. And in general it is completely wrong for a block device driver to do this. The new one is more like a regular block device driver. It has no idea about vm/vfs stuff. It's backing store is similar to the buffer cache (a simple radix-tree of pages), but it doesn't know anything about page cache (the pages in the radix tree are not pagecache pages). There is one slight downside -- direct block device access and filesystem metadata access goes through an extra copy and gets stored in RAM twice. However, this downside is only slight, because the real buffercache of the device is now reclaimable (because we're not playing crazy games with it), so under memory intensive situations, footprint should effectively be the same -- maybe even a slight advantage to the new driver because it can also reclaim buffer heads. The fact that it now goes through all the regular vm/fs paths makes it much more useful for testing, too. text data bss dec hex filename 2837 849 384 4070 fe6 drivers/block/rd.o 3528 371 12 3911 f47 drivers/block/brd.o Text is larger, but data and bss are smaller, making total size smaller. A few other nice things about it: - Similar structure and layout to the new loop device handlinag. - Dynamic ramdisk creation. - Runtime flexible buffer head size (because it is no longer part of the ramdisk code). - Boot / load time flexible ramdisk size, which could easily be extended to a per-ramdisk runtime changeable size (eg. with an ioctl). - Can use highmem for the backing store. [akpm@linux-foundation.org: fix build] [byron.bbradley@gmail.com: make rd_size non-static] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Byron Bradley <byron.bbradley@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08Enhanced partition statistics: aoe fixJerome Marchand
Updates the enhanced partition statistics in ATA over Ethernet driver (not tested). Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2008-02-07Merge branch 'for-2.6.25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (69 commits) [POWERPC] Add SPE registers to core dumps [POWERPC] Use regset code for compat PTRACE_*REGS* calls [POWERPC] Use generic compat_sys_ptrace [POWERPC] Use generic compat_ptrace_request [POWERPC] Use generic ptrace peekdata/pokedata [POWERPC] Use regset code for PTRACE_*REGS* requests [POWERPC] Switch to generic compat_binfmt_elf code [POWERPC] Switch to using user_regset-based core dumps [POWERPC] Add user_regset compat support [POWERPC] Add user_regset_view definitions [POWERPC] Use user_regset accessors for GPRs [POWERPC] ptrace accessors for special regs MSR and TRAP [POWERPC] Use user_regset accessors for SPE regs [POWERPC] Use user_regset accessors for altivec regs [POWERPC] Use user_regset accessors for FP regs [POWERPC] mpc52xx: fix compile error introduce when rebasing patch [POWERPC] 4xx: PCIe indirect DCR spinlock fix. [POWERPC] Add missing native dcr dcr_ind_lock spinlock [POWERPC] 4xx: Fix offset value on Warp board [POWERPC] 4xx: Add 440EPx Sequoia ehci dts entry ...
2008-02-06Merge branch 'virtex-for-2.6.25' of ↵Josh Boyer
git://git.secretlab.ca/git/linux-2.6-virtex into for-2.6.25
2008-02-06Atari floppy: Rename disk_type to atari_disk_typeGeert Uytterhoeven
Commit edfaa7c36574f1bf09c65ad602412db9da5f96bf Driver core: convert block from raw kobjects to core devices This moves the block devices to /sys/class/block. It will create a flat list of all block devices, with the disks and partitions in one directory. For compatibility /sys/block is created and contains symlinks to the disks. introduced a global disk_type variable in <linux/genhd.h>, causing the following compile error on Atari: drivers/block/ataflop.c:93: error: conflicting types for 'disk_type' include/linux/genhd.h:21: error: previous declaration of 'disk_type' was here Rename the local disk_type variable in drivers/block/ataflop.c to atari_disk_type, to avoid the conflict. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06cciss: use upper_32_bits() macro to eliminate warningsRandy Dunlap
Use upper_32_bits(x) macro to handle shifts that may be >= the width of the data type. drivers/block/cciss.c: In function 'do_cciss_request': drivers/block/cciss.c:2655: warning: right shift count >= width of type drivers/block/cciss.c:2656: warning: right shift count >= width of type drivers/block/cciss.c:2657: warning: right shift count >= width of type drivers/block/cciss.c:2658: warning: right shift count >= width of type Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: <mike.miller@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06rd: use is_power_of_2() in drivers/block/rd.c.Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06Allow auto-destruction of loop devicesDavid Woodhouse
This allows a flag to be set on loop devices so that when they are closed for the last time, they'll self-destruct. In general, so that we can automatically allocate loop devices (as with losetup -f) and have them disappear when we're done with them. In particular, right now, so that we can stop relying on the hackish special-case in umount(8) which kills off loop devices which were set up by 'mount -oloop'. That means we can stop putting crap in /etc/mtab which doesn't belong there, which means it can be a symlink to /proc/mounts, which means yet another writable file on the root filesystem is eliminated and the 'stateless' folks get happier... and OLPC trac #356 can be closed. The mount(8) side of that is at http://marc.info/?l=util-linux-ng&m=119362955431694&w=2 [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Bernardo Innocenti <bernie@codewiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06fix ! versus & precedence in various placesAlexey Dobriyan
Fix various instances of if (!expr & mask) which should probably have been if (!(expr & mask)) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Karsten Keil <kkeil@suse.de> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Garzik <jeff@garzik.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06[POWERPC] Xilinx: Update compatible to use values generated by BSP generator.Stephen Neuendorffer
Mainly, this involves two changes: 1) xilinx->xlnx (recognized standard is to use the stock ticker) 2) In order to have the device tree focus on describing what the hardware is as exactly as possible, the compatible strings contain the full IP name and IP version. Signed-off-by: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com> Acked-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-02-06[POWERPC] Fix incorrectly tagged __devinitdata structuresGrant Likely
Fix compile errors in the xilinxfb, xsysace and uartlite drivers used by the Xilinx Virtex platform Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
2008-02-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (25 commits) virtio: balloon driver virtio: Use PCI revision field to indicate virtio PCI ABI version virtio: PCI device virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzz virtio_blk: Dont waste major numbers virtio_blk: provide getgeo virtio_net: parametrize the napi_weight for virtio receive queue. virtio: free transmit skbs when notified, not on next xmit. virtio: flush buffers on open virtnet: remove double ether_setup virtio: Allow virtio to be modular and used by modules virtio: Use the sg_phys convenience function. virtio: Put the virtio under the virtualization menu virtio: handle interrupts after callbacks turned off virtio: reset function virtio: populate network rings in the probe routine, not open virtio: Tweak virtio_net defines virtio: Net header needs hdr_len virtio: remove unused id field from struct virtio_blk_outhdr virtio: clarify NO_NOTIFY flag usage ...
2008-02-04virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzzChristian Borntraeger
Am Freitag, 1. Februar 2008 schrieb Christian Borntraeger: > Right. I will fix that with an additional patch. This patch goes on top of the minor number patch. Please let me know if you want a merged patch: Currently virtio_blk creates the disk name combinging "vd" with 'a'++. This will give strange names after vdz. I have implemented names up to vdzzz - inspired by the sd.c code. That should be sufficient for now. There is one driver in the kernel (driver/s390/block/dasd_genhd.c) that implements names from dasda-dasdzzzz allowing even more disks. Maybe a janitor can come up with a common implementation usable for all kind of block device drivers. I have tested this patch with 100 disks - seems to work. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio_blk: Dont waste major numbersChristian Borntraeger
Rusty, currently virtio_blk uses one major number per device. While this works quite well on most systems it is wasteful and will exhaust major numbers on larger installations. This patch allocates a major number on init and will use 16 minor numbers for each disk. That will allow ~64k virtio_blk disks. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio_blk: provide getgeoChristian Borntraeger
Rusty, I currently try to make my guest boot from an virtio root device without having an external kernel. Some of the tools that I tried expect HDIO_GETGEO to work. The most interesting value is likely the geo.start value to get the offset of a partition. This value is filled by block/ioctl.c if fops->getgeo is set. This patch also fills in some standard values for heads, sectors and cylinders. Makes sense? Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Put the virtio under the virtualization menuAnthony Liguori
This patch moves virtio under the virtualization menu and changes virtio devices to not claim to only be for lguest. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: reset functionRusty Russell
A reset function solves three problems: 1) It allows us to renegotiate features, eg. if we want to upgrade a guest driver without rebooting the guest. 2) It gives us a clean way of shutting down virtqueues: after a reset, we know that the buffers won't be used by the host, and 3) It helps the guest recover from messed-up drivers. So we remove the ->shutdown hook, and the only way we now remove feature bits is via reset. We leave it to the driver to do the reset before it deletes queues: the balloon driver, for example, needs to chat to the host in its remove function. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: explicit enable_cb/disable_cb rather than callback return.Rusty Russell
It seems that virtio_net wants to disable callbacks (interrupts) before calling netif_rx_schedule(), so we can't use the return value to do so. Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback now returns void, rather than a boolean. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: simplify config mechanism.Rusty Russell
Previously we used a type/len pair within the config space, but this seems overkill. We now simply define a structure which represents the layout in the config space: the config space can now only be extended at the end. The main driver-visible changes: 1) We indicate what fields are present with an explicit feature bit. 2) Virtqueues are explicitly numbered, and not in the config space. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-03drivers/block/: Spelling fixesJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-01USB: Remove unnecessary zeroing from ubPete Zaitcev
These zeroings were taken from usb-storage long time ago. I examined the submission paths and usb_fill_bulk_urb and found them unnecessary. Signed-off-by: Pete Zaitcev <zaitcev@yahoo.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-01block/sunvdc.c:print_version() must be __devinitAdrian Bunk
This patch fixes the following section mismatches: <-- snip --> ... WARNING: drivers/block/sunvdc.o(.text+0xf0): Section mismatch in reference from the function print_version() to the variable .devinit.data:version WARNING: drivers/block/sunvdc.o(.text+0xf8): Section mismatch in reference from the function print_version() to the variable .devinit.data:version ... <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-29cciss: fix bug in overriding ->data_len before completionJens Axboe
For BLOCK_PC requests, we need that length for completing the request. Andrew Vasquez <andrew.vasquez@qlogic.com> reported the following oops Hitting a consistent BUG() with recent Linus' linux-2.6.git: [ 12.941428] ------------[ cut here ]------------ [ 12.944874] kernel BUG at drivers/block/cciss.c:1260! [ 12.944874] invalid opcode: 0000 [1] SMP [ 12.944874] CPU 0 [ 12.944874] Modules linked in: [ 12.944874] Pid: 0, comm: swapper Not tainted 2.6.24 #43 [ 12.944874] RIP: 0010:[<ffffffff8039e43d>] [<ffffffff8039e43d>] cciss_softirq_done+0xbc/0x1bf [ 12.944874] RSP: 0018:ffffffff8063aed0 EFLAGS: 00010202 [ 12.944874] RAX: 0000000000000001 RBX: ffff8100cf800010 RCX: ffff81042f1253b0 [ 12.944874] RDX: ffff81042de398f0 RSI: ffff81042de398f0 RDI: 0000000000000001 [ 12.944874] RBP: ffff81042daa0000 R08: ffff81042f1253b0 R09: 0000000000000001 [ 12.944874] R10: 00000000000000fe R11: 0000000000000000 R12: 0000000000000002 [ 12.944874] R13: 0000000000000001 R14: ffff8100cf800000 R15: ffff81042de398f0 [ 12.944874] FS: 0000000000000000(0000) GS:ffffffff805bb000(0000) knlGS:0000000000000000 [ 12.944874] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 12.944874] CR2: 00002afed7eea340 CR3: 000000042dbba000 CR4: 00000000000006e0 [ 12.944874] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 12.944874] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 12.944874] Process swapper (pid: 0, threadinfo ffffffff805f4000, task ffffffff805624a0) [ 12.944874] Stack: 0000000000000000 ffffffff8063af10 0000000000000001 ffffffff80632d60 [ 12.944874] 0000000000000000 000000000000000a ffffffff805bb900 ffffffff8032038f [ 12.944874] ffffffff8063af10 ffffffff8063af10 ffffffff805bb940 ffffffff802346b4 [ 12.944874] Call Trace: [ 12.944874] <IRQ> [<ffffffff8032038f>] blk_done_softirq+0x69/0x78 [ 12.944874] [<ffffffff802346b4>] __do_softirq+0x6f/0xd8 [ 12.944874] [<ffffffff8020c45c>] call_softirq+0x1c/0x30 [ 12.944874] [<ffffffff8020e347>] do_softirq+0x30/0x80 [ 12.944874] [<ffffffff8020e409>] do_IRQ+0x72/0xd9 [ 12.944874] [<ffffffff8020a50a>] mwait_idle+0x0/0x46 [ 12.944874] [<ffffffff8020a3da>] default_idle+0x0/0x3d [ 12.944874] [<ffffffff8020b7e1>] ret_from_intr+0x0/0xa [ 12.944874] <EOI> [<ffffffff8020a54c>] mwait_idle+0x42/0x46 [ 12.944874] [<ffffffff8020a481>] cpu_idle+0x6a/0xae [ 12.944874] [ 12.944874] [ 12.944874] Code: 0f 0b eb fe 48 8d 85 d8 c0 00 00 48 89 04 24 48 89 c7 e8 e5 [ 12.944874] RIP [<ffffffff8039e43d>] cciss_softirq_done+0xbc/0x1bf [ 12.944874] RSP <ffffffff8063aed0> [ 12.944903] ---[ end trace e9c631603f90d22f ]--- which is caused by blk_end_request() returning 'not done' for a request, since it gets asked to complete zero bytes. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-29xsysace: end request handling fixJens Axboe
In ace_fsm_dostate(), the variable 'i' was used only for passing sector size of the request to end_that_request_first(). So I removed it and changed the code to pass the size in bytes directly to __blk_end_request() Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (197 commits) sh: add spi header and r2d platform data V3 sh: update r7780rp interrupt code sh: remove consistent alloc stuff from the machine vector sh: use declared coherent memory for dreamcast pci ethernet adapter sh: declared coherent memory support V2 sh: Add support for SDK7780 board. sh: constify function pointer tables sh: Kill off -traditional for linker script. cdrom: Add support for Sega Dreamcast GD-ROM. sh: Kill off hs7751rvoip reference from arch/sh/Kconfig. sh: Drop r7780rp_defconfig, use r7780mp_defconfig as kbuild default. sh: Kill off dead HS771RVoIP board support. sh: r7785rp: Fix up DECLARE_INTC_DESC() arg mismatch. sh: r7785rp: Hook up the rest of the HL7785 FPGA IRQ vectors. sh: r2d - enable sm501 usb host function sh: remove voyagergx sh: r2d - add lcd planel timings to sm501 platform data sh: Add OHCI and UDC platform devices for SH7720. sh: intc - remove default interrupt priority tables sh: Correct pte size mismatch for X2 TLB. ...
2008-01-28blk_end_request: changing xsysace (take 4)Kiyoshi Ueda
This patch converts xsysace to use blk_end_request interfaces. Related 'uptodate' arguments are converted to 'error'. xsysace is a little bit different from "normal" drivers. xsysace driver has a state machine in it. It calls end_that_request_first() and end_that_request_last() from different states. (ACE_FSM_STATE_REQ_TRANSFER and ACE_FSM_STATE_REQ_COMPLETE, respectively.) However, those states are consecutive and without any interruption inbetween. So we can just follow the standard conversion rule (b) mentioned in the patch subject "[PATCH 01/30] blk_end_request: add new request completion interface". Cc: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28blk_end_request: changing ub (take 4)Kiyoshi Ueda
This patch converts ub to use blk_end_request interfaces. Related 'uptodate' arguments are converted to 'error'. Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28blk_end_request: changing cpqarray (take 4)Kiyoshi Ueda
This patch converts cpqarray to use blk_end_request interfaces. Related 'ok' arguments are converted to 'error'. cpqarray is a little bit different from "normal" drivers. cpqarray directly calls bio_endio() and disk_stat_add() when completing request. But those can be replaced with __end_that_request_first(). After the replacement, request completion procedures of those drivers become like the following: o end_that_request_first() o add_disk_randomness() o end_that_request_last() This can be converted to __blk_end_request() by following the rule (b) mentioned in the patch subject "[PATCH 01/30] blk_end_request: add new request completion interface". Cc: Mike Miller <mike.miller@hp.com> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28blk_end_request: changing cciss (take 4)Kiyoshi Ueda
This patch converts cciss to use blk_end_request interfaces. Related 'uptodate' arguments are converted to 'error'. cciss is a little bit different from "normal" drivers. cciss directly calls bio_endio() and disk_stat_add() when completing request. But those can be replaced with __end_that_request_first(). After the replacement, request completion procedures of those drivers become like the following: o end_that_request_first() o add_disk_randomness() o end_that_request_last() This can be converted to blk_end_request() by following the rule (a) mentioned in the patch subject "[PATCH 01/30] blk_end_request: add new request completion interface". Cc: Mike Miller <mike.miller@hp.com> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>