aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2006-06-17IB: IP address based RDMA connection managerSean Hefty
Kernel connection management agent over InfiniBand that connects based on IP addresses. The agent defines a generic RDMA connection abstraction to support clients wanting to connect over different RDMA devices. The agent also handles RDMA device hotplug events on behalf of clients. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB: address translation to map IP toIB addresses (GIDs)Sean Hefty
Add an address translation service that maps IP addresses to InfiniBand GID addresses using IPoIB. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17[NET]: Export ip_dev_find()Sean Hefty
Export ip_dev_find() to allow locating a net_device given an IP address. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/cm: Match connection requests based on private dataSean Hefty
Extend matching connection requests to listens in the InfiniBand CM to include private data checks. This allows applications to listen on the same service identifier, with private data directing the request to the appropriate application. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB: common handling for marshalling parameters to/from userspaceSean Hefty
Provide common handling for marshalling data between userspace clients and kernel InfiniBand drivers. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/mthca: memfree completion with error FW bug workaroundMichael S. Tsirkin
Memfree firmware is in rare cases reporting WQE index == base - 1 in receive completion with error, instead of (rq size - 1); base is 0 in mthca. Here is a patch to avoid kernel crash and report a correct WR id in this case. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/mthca: restore missing PCI registers after resetMichael S. Tsirkin
mthca does not restore the following PCI-X/PCI Express registers after reset: PCI-X device: PCI-X command register PCI-X bridge: upstream and downstream split transaction registers PCI Express : PCI Express device control and link control registers This causes instability and/or bad performance on systems where one of these registers is set to a non-default value by BIOS. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17Linux v2.6.17Linus Torvalds
Being named "Crazed Snow-Weasel" instills a lot of confidence in this release, so I'm sure this will be one of the better ones.
2006-06-17[PATCH] powerpc: enable CPU_FTR_CI_LARGE_PAGE for cellArnd Bergmann
Reflect the fact that the Cell Broadband Engine supports 64k pages by adding the bit to the CPU features. Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] powerpc: Fix 64k pages on non-partitioned machinesArnd Bergmann
The page size encoding passed to tlbie is incorrect for new-style large pages. This fixes it. This doesn't affect anything on older machines because mmu_psize_defs[psize].penc (the page size encoding) is 0 for 4k and 16M pages (the two are distinguished by a separate "is a large page" bit). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] arm_timer: remove a racy and obsolete PF_EXITING checkOleg Nesterov
arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state) in run_posix_cpu_timers(). However, for some reason it does so only for CPUCLOCK_PERTHREAD case (which is imho wrong). Also, this check is not reliable, PF_EXITING could be set on another cpu without any locks/barriers just after the check, so it can't prevent from attaching the timer to the exiting task. The previous patch makes this check unneeded. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] run_posix_cpu_timers: remove a bogus BUG_ON()Oleg Nesterov
do_exit() clears ->it_##clock##_expires, but nothing prevents another cpu to attach the timer to exiting process after that. arm_timer() tries to protect against this race, but the check is racy. After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and before do_exit() calls 'schedule() local timer interrupt can find tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu does sys_wait4) interrupted task has ->signal == NULL. At this moment exiting task has no pending cpu timers, they were cleanuped in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just return from irq. John Stultz recently confirmed this bug, see http://marc.theaimsgroup.com/?l=linux-kernel&m=115015841413687 Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] check_process_timers: fix possible lockupOleg Nesterov
If the local timer interrupt happens just after do_exit() sets PF_EXITING (and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call check_process_timers() with tasklist_lock + ->siglock held and check_process_timers: t = tsk; do { .... do { t = next_thread(t); } while (unlikely(t->flags & PF_EXITING)); } while (t != tsk); the outer loop will never stop. Actually, the window is bigger. Another process can attach the timer after ->it_xxx_expires was cleared (see the next commit) and the 'if (PF_EXITING)' check in arm_timer() is racy (see the one after that). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] sky2: netconsole suspend/resume interactionStephen Hemminger
A couple of fixes that should prevent crashes when using netconsole and suspend/resume. First, netconsole poll routine shouldn't run unless the device is up; second, the NAPI poll should be disabled during suspend. This is only an issue on sky2, because it has to have one NAPI poll routine for both ports on dual port boards. Normal drivers use netif_rx_schedule_prep and that checks for netif_running. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] Fix missing ret assignment in __bio_map_user() error pathJens Axboe
If get_user_pages() returns less pages than what we asked for, we jump to out_unmap which will return ERR_PTR(ret). But ret can contain a positive number just smaller than local_nr_pages, so be sure to set it to -EFAULT always. Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-17[PATCH] fix cdrom openJens Axboe
Some time ago the cdrom open routine was changed so that we call the driver's open routine before checking to see if it is read only. However, if we discovered that a read write open was not possible and the open flags required a writable open, we just returned -EROFS without calling the driver's release routine. This seems to work for most cdrom drivers, but breaks the Powerpc iSeries virtual cdrom rather badly. This just inserts the release call in the error path to balance the call to "->open()" done by "open_for_data()". Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-14[PATCH] cfq-iosched: fix crash in do_div()Jens Axboe
We don't clear the seek stat values in cfq_alloc_io_context(), and if ->seek_mean is unlucky enough to be set to -36 by chance, the first invocation of cfq_update_io_seektime() will oops with a divide by zero in do_div(). Just memset the entire cic instead of filling invididual values independently. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-14[PATCH] Return error in case flock_lock_file failureKirill Korotaev
If flock_lock_file() failed to allocate flock with locks_alloc_lock() then "error = 0" is returned. Need to return some non-zero. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: stop/start hardware idle timer on suspend/resumeStephen Hemminger
The resume bug was caused not by an early interrupt but because the idle timeout was not being stopped on suspend. Also disable hardware IRQ's on suspend. Will need to revisit this with hotplug? Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: save/restore base hardware irq during suspend/resumeStephen Hemminger
The hardware should be fully shut off during suspend, and the base irq mask restored during resume. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: fix hotplug detect during pollStephen Hemminger
If the poll routine detects no hardware available, it needs to dequeue it self from the network poll list. Linus didn't understand NAPI. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: don't hard code number of portsStephen Hemminger
It is cleaner, to not loop over both ports if only one exists. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-13[PATCH] sky2: set_power_state should be voidStephen Hemminger
The set power state function is cleaner if it doesn't return anything. The only caller that could fail is in suspend() and it can check the argument there. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12[PATCH] alpha: generic hweight build fixRandy Dunlap
From: Randy Dunlap <rdunlap@xenotime.net> According to include/asm-alpha/bitops.h, only ALPHA_EV67 has hardware hweight support, so ALPHA_EV6 needs to use GENERIC_HWEIGHT. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Ernst Herzberg <earny@net4u.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12[PATCH] tmpfs: Decrement i_nlink correctly in shmem_rmdir()Sergey Vlasov
shmem_rmdir() must undo the increment of i_nlink done in shmem_get_inode() for directories, otherwise at least IN_DELETE_SELF inotify event generation is broken. Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12[PATCH] tmpfs: time granularity fix for [acm]time going backwardsRobin H. Johnson
I noticed a strange behavior in a tmpfs file system the other day, while building packages - occasionally, and seemingly at random, make decided to rebuild a target. However, only on tmpfs. A file would be created, and if checked, it had a sub-second timestamp. However, after an utimes related call where sub-seconds should be set, they were zeroed instead. In the case that a file was created, and utimes(...,NULL) was used on it in the same second, the timestamp on the file moved backwards. After some digging, I found that this was being caused by tmpfs not having a time granularity set, thus inheriting the default 1 second granularity. Hugh adds: yes, we missed tmpfs when the s_time_gran mods went into 2.6.11. Unfortunately, the granularity of CURRENT_TIME, often used in filesystems, does not match the default granularity set by alloc_super. A few more such discrepancies have been found, but this is the most important to fix now. Signed-off-by: Robin H. Johnson <robbat2@gentoo.org> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Do not double-export sys_close() when CONFIG_SOLARIS_EMUL_MODULE
2006-06-12Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4]: Increment ipInHdrErrors when TTL expires. [TCP]: continued: reno sacked_out count fix [DCCP] Ackvec: fix soft lockup in ackvec handling code
2006-06-12Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] Fix Integrator and Versatile interrupt initialisation [ARM] 3546/1: PATCH: subtle lost interrupts bug on i.MX [ARM] 3547/1: PXA-OHCI: Allow platforms to specify a power budget [ARM] Fix Neponset IRQ handling
2006-06-12[IPV4]: Increment ipInHdrErrors when TTL expires.Weidong
Signed-off-by: Weidong <weid@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-12[sky2] Fix sky2 network driver suspend/resumeLinus Torvalds
This fixes two independent problems: it would not save the PCI state on suspend (and thus try to resume a nonexistent state on resume), and while shut off, if an interrupt happened on the same shared irq, the irq handler would react very badly to the interrupt status being an invalid all-ones state. Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-12Merge branch 'upstream-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [PATCH] sata_mv: grab host lock inside eng_timeout
2006-06-11[TCP]: continued: reno sacked_out count fixAki M Nyrhinen
From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi> IMHO the current fix to the problem (in_flight underflow in reno) is incorrect. it treats the symptons but ignores the problem. the problem is timing out packets other than the head packet when we don't have sack. i try to explain (sorry if explaining the obvious). with sack, scanning the retransmit queue for timed out packets is fine because we know which packets in our retransmit queue have been acked by the receiver. without sack, we know only how many packets in our retransmit queue the receiver has acknowledged, but no idea which packets. think of a "typical" slow-start overshoot case, where for example every third packet in a window get lost because a router buffer gets full. with sack, we check for timeouts on those every third packet (as the rest have been sacked). the packet counting works out and if there is no reordering, we'll retransmit exactly the packets that were lost. without sack, however, we check for timeout on every packet and end up retransmitting consecutive packets in the retransmit queue. in our slow-start example, 2/3 of those retransmissions are unnecessary. these unnecessary retransmissions eat the congestion window and evetually prevent fast recovery from continuing, if enough packets were lost. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11[DCCP] Ackvec: fix soft lockup in ackvec handling codeAndrea Bittau
A soft lockup existed in the handling of ack vector records. Specifically, when a tail of the list of ack vector records was removed, it was possible to end up iterating infinitely on an element of the tail. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11[SPARC64]: Do not double-export sys_close() when CONFIG_SOLARIS_EMUL_MODULEDavid S. Miller
It is already exported by fs/open.c Noticed by Ben Collins. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11[PATCH] Fix for the PPTP hangs that have been reportedPaul Mackerras
People have been reporting that PPP connections over ptys, such as used with PPTP, will hang randomly when transferring large amounts of data, for instance in http://bugzilla.kernel.org/show_bug.cgi?id=6530. I have managed to reproduce the problem, and the patch below fixes the actual cause. The problem is not in fact in ppp_async.c but in n_tty.c. What happens is that when pptp reads from the pty, we call read_chan() in drivers/char/n_tty.c on the master side of the pty. That copies all the characters out of its buffer to userspace and then calls check_unthrottle(), which calls the pty unthrottle routine, which calls tty_wakeup on the slave side, which calls ppp_asynctty_wakeup, which calls tasklet_schedule. So far so good. Since we are in process context, the tasklet runs immediately and calls ppp_async_process(), which calls ppp_async_push, which calls the tty->driver->write function to send some more output. However, tty->driver->write() returns zero, because the master tty->receive_room is still zero. We haven't returned from check_unthrottle() yet, and read_chan() only updates tty->receive_room _after_ calling check_unthrottle. That means that the driver->write call in ppp_async_process() returns 0. That would be fine if we were going to get a subsequent wakeup call, but we aren't (we just had it, and the buffer is now empty). The solution is for n_tty.c to update tty->receive_room _before_ calling the driver unthrottle routine. The patch below does this. With this patch I was able to transfer a 900MB file over a PPTP connection (taking about 25 minutes), whereas without the patch the connection would always stall in under a minute. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-11[PATCH] sata_mv: grab host lock inside eng_timeoutMark Lord
Bug fix: mv_eng_timeout() calls mv_err_intr() without first grabbing the host lock, which can lead to all sorts of interesting scenarios. This whole error-handling portion of sata_mv is nasty (and will get fixed for the new EH stuff), but for now this patch will help keep it on life-support. Signed-off-by: Mark Lord <liml@rtr.ca> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-11Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: [PATCH] PCI: reverse pci config space restore order [PATCH] PCI: Improve PCI config space writeback [PATCH] PCI: Error handling on PCI device resume [PATCH] PCI: fix pciehp compile issue when CONFIG_ACPI is not enabled
2006-06-11[PATCH] typo in vmscan.cChristoph Lameter
From: Christoph Lameter <clameter@sgi.com> Looks like a comma was left from the conversion from a struct to an assignment. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-11[PATCH] PCI: reverse pci config space restore orderYu, Luming
According to Intel ICH spec, there are several rules that Base Address should be programmed before IOSE (PCICMD register ) enabled. For example ICH7: 12.1.3 SATA : the base address register for the bus master register should be programmed before this bit is set. 11.1.3: PCICMD (USB): The base address register for USB should be programmed before this bit is set. .... To make sure kernel code follow this rule , and prevent unnecessary confusion. I proposal this patch. Signed-off-by: Luming Yu <luming.yu@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-11[PATCH] PCI: Improve PCI config space writebackDave Jones
At least one laptop blew up on resume from suspend with a black screen due to a lack of this patch. By only writing back config space that is different, we minimise the possibility of accidents like this. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-11[PATCH] PCI: Error handling on PCI device resumeJean Delvare
We currently don't handle errors properly when resuming a PCI device: * In pci_default_resume() we capture the error code returned by pci_enable_device() but don't pass it up to the caller. Introduced by commit 95a629657dbe28e44a312c47815b3dc3f1ce0970 * In pci_resume_device(), the errors possibly returned by the driver's .resume method or by the generic pci_default_resume() function are ignored. This patch fixes both issues. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-11[PATCH] PCI: fix pciehp compile issue when CONFIG_ACPI is not enabledakpm@osdl.org
Fix build error when CONFIG_ACPI not defined Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-10[SPARC]: Migration cost tune up in sparc smp.Krzysztof Helt
This patch sets the max_cache_size value required to tune up scheduler in SMP systems. Otherwise, the calculated migration_cost is too high and task scheduling may lock up. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-10[SPARC64]: Set appropriate max_cache_size.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-10Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Avoid JBUS errors on some Niagara systems. [FUSION]: Fix mptspi.c build with CONFIG_PM not set. [TG3]: Handle Sun onboard tg3 chips more correctly. [SPARC64]: Dump local cpu registers in sun4v_log_error()
2006-06-10[PATCH] powerpc: console_initcall ordering issuesMilton Miller
From: Milton Miller <miltonm@bga.com> The add_preferred_console call in rtas_console.c was not causing the console to be selected. It turns out that the add_preferred_console was being called after the hvc_console driver was registered. It only works when it is called before the console driver is registered. Reorder hvc_console.o after the hvc_console drivers to allow the selection during console_initcall processing. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-10[PATCH] I2O: Bugfixes to get I2O working againMarkus Lidel
From: Markus Lidel <Markus.Lidel@shadowconnect.com> - Fixed locking of struct i2o_exec_wait in Executive-OSM - Removed LCT Notify in i2o_exec_probe() which caused freeing memory and accessing freed memory during first enumeration of I2O devices - Added missing locking in i2o_exec_lct_notify() - removed put_device() of I2O controller in i2o_iop_remove() which caused the controller structure get freed to early - Fixed size of mempool in i2o_iop_alloc() - Fixed access to freed memory in i2o_msg_get() See http://bugzilla.kernel.org/show_bug.cgi?id=6561 Signed-off-by: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-10[PATCH] powernow-k8 crash workaroundAndrew Morton
From: Andrew Morton <akpm@osdl.org> Work around the oops reported in http://bugzilla.kernel.org/show_bug.cgi?id=6478. Thanks to Ralf Hildebrandt <ralf.hildebrandt@charite.de> for testing and reporting. Acked-by: Dave Jones <davej@codemonkey.org.uk> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-10[PATCH] Further alterations for memory barrier documentDavid Howells
From: David Howells <dhowells@redhat.com> Apply some alterations to the memory barrier document that I worked out with Paul McKenney of IBM, plus some of the alterations suggested by Alan Stern. The following changes were made: (*) One of the examples given for what can happen with overlapping memory barriers was wrong. (*) The description of general memory barriers said that a general barrier is a combination of a read barrier and a write barrier. This isn't entirely true: it implies both, but is more than a combination of both. (*) The first example in the "SMP Barrier Pairing" section was wrong: the loads around the read barrier need to touch the memory locations in the opposite order to the stores around the write barrier. (*) Added a note to make explicit that the loads should be in reverse order to the stores. (*) Adjusted the diagrams in the "Examples Of Memory Barrier Sequences" section to make them clearer. Added a couple of diagrams to make it more clear as to how it could go wrong without the barrier. (*) Added a section on memory speculation. (*) Dropped any references to memory allocation routines doing memory barriers. They may do sometimes, but it can't be relied on. This may be worthy of further documentation later. (*) Made the fact that a LOCK followed by an UNLOCK should not be considered a full memory barrier more explicit and gave an example. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>