aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-07-03Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] cevt-txx9: Reset timer counter on initialization [MIPS] IP22: Fix crashes due to wrong L1_CACHE_BYTES [MIPS] IP32: Fix unexpected irq 71
2008-07-03hrtimer: prevent migration for raising softirqSteven Rostedt
Due to a possible deadlock, the waking of the softirq was pushed outside of the hrtimer base locks. See commit 0c96c5979a522c3323c30a078a70120e29b5bdbc Unfortunately this allows the task to migrate after setting up the softirq and raising it. Since softirqs run a queue that is per-cpu we may raise the softirq on the wrong CPU and this will keep the queued softirq task from running. To solve this issue, this patch disables preemption around the releasing of the hrtimer lock and raising of the softirq. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-03[MIPS] cevt-txx9: Reset timer counter on initializationAtsushi Nemoto
The txx9_tmr_init() will not clear a timer counter register in a certain case. The counter register is cleared on 1->0 transition of TCE bit if CRE=1. So just clearing the TCE bit is not enough. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-03[MIPS] IP22: Fix crashes due to wrong L1_CACHE_BYTESThomas Bogendoerfer
The introduction of a real dma cache invalidate makes it important to have a correct cache line size, otherwise the kernel will gives out two memory segment, which might share one cache line. The R4400 Indy/Indigo2 CPU modules are using a second level cache line size of 128 bytes, so MIPS_L1_CACHE_SHIFT needs to be bumped up to 7 for IP22. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-03[MIPS] IP32: Fix unexpected irq 71Thomas Bogendoerfer
It's possible that the crime interrupt handler is called without pending interrupts (probably a hardware issue). To avoid irritating "unexpected irq 71" messages, we now just ignore the spurious crime interrupts. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: fix O_APPEND in legacy mode
2008-07-03Do not overwrite nr_zones on !NUMA when initialising zlcache_ptrMel Gorman
The non-NUMA case of build_zonelist_cache() would initialize the zlcache_ptr for both node_zonelists[] to NULL. Which is problematic, since non-NUMA only has a single node_zonelists[] entry, and trying to zero the non-existent second one just overwrote the nr_zones field instead. As kswapd uses this value to determine what reclaim work is necessary, the result is that kswapd never reclaims. This causes processes to stall frequently in low-memory situations as they always direct reclaim. This patch initialises zlcache_ptr correctly. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Tested-by: Dan Williams <dan.j.williams@intel.com> [ Simplified patch a bit ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-039p: fix O_APPEND in legacy modeEric Van Hensbergen
The legacy protocol's open operation doesn't handle an append operation (it is expected that the client take care of it). We were incorrectly passing the extended protocol's flag through even in legacy mode. This was reported in bugzilla report #10689. This patch fixes the problem by disallowing extended protocol open modes from being passed in legacy mode and implemented append functionality on the client side by adding a seek after the open. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-07-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: acpiphp: cleanup notify handler on all root bridges PCI: Limit VPD read/write lengths for Broadcom 5706, 5708, 5709 rev. PCI: Restrict VPD read permission to root
2008-07-02Merge branch 'i2c-fix' of git://aeryn.fluff.org.uk/bjdooks/linuxLinus Torvalds
* 'i2c-fix' of git://aeryn.fluff.org.uk/bjdooks/linux: I2C: S3C2410: Add MODULE_ALIAS() for s3c2440 device. I2C: S3C2410: Fixup error codes returned rom a transfer. I2C: S3C2410: Check ACK on byte transmission
2008-07-02Merge branch 'for-2.6.26' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block: Properly notify block layer of sync writes block: Fix the starving writes bug in the anticipatory IO scheduler
2008-07-02Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] export account_system_vtime [IA64] Bugfix for system with 32 cpus
2008-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvbLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: V4L/DVB (8178): uvc: Fix compilation breakage for the other drivers, if uvc is selected V4L/DVB (8145a): USB Video Class driver
2008-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: ide: fix /proc/ide/ide?/mate reporting Revert "BAST: Remove old IDE driver"
2008-07-02Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] 5131/1: Annotate platform_secondary_init with trace_hardirqs_off [ARM] 5117/1: pxafb: fix __devinit/exit annotations [ARM] Export dma_sync_sg_for_device() [ARM] 5109/1: Mark rtc sa1100 driver as wakeup source before registering it [ARM] 5116/1: pxafb: cleanup and fix order of failure handling [ARM] 5115/1: pxafb: fix ifdef for command line option handling ARM: OMAP: Correcting the gpmc prefetch control register address ARM: OMAP: DMA: Don't mark channel active in omap_enable_channel_irq
2008-07-02tty: Fix inverted logic in send_breakAlan Cox
Not sure how this came to get inverted but it appears to have been my mess up. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-02Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: fix divide error when trying to configure rt_period to zero
2008-07-02Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6Linus Torvalds
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: i2c: Fix bad hint about irqs in i2c.h i2c: Documentation: fix device matching description
2008-07-02Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: fix hotplug vs rcu race
2008-07-02Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix NODES_SHIFT Kconfig range
2008-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] esp: tidy up target reference counting [SCSI] esp: Fix OOPS in esp_reset_cleanup(). [SCSI] ses: Fix timeout
2008-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm crypt: use cond_resched
2008-07-02Merge branch 'for-2.6.26' of git://neil.brown.name/mdLinus Torvalds
* 'for-2.6.26' of git://neil.brown.name/md: Fix error paths if md_probe fails. Don't acknowlege that stripe-expand is complete until it really is. Ensure interrupted recovery completed properly (v1 metadata plus bitmap)
2008-07-02Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc/mpc5200: Fix lite5200b suspend/resume powerpc/legacy_serial: Bail if reg-offset/shift properties are present powerpc/bootwrapper: update for initrd with simpleImage
2008-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (55 commits) net: fib_rules: fix error code for unsupported families netdevice: Fix wrong string handle in kernel command line parsing net: Tyop of sk_filter() comment netlink: Unneeded local variable net-sched: fix filter destruction in atm/hfsc qdisc destruction net-sched: change tcf_destroy_chain() to clear start of filter list ipv4: fix sysctl documentation of time related values mac80211: don't accept WEP keys other than WEP40 and WEP104 hostap: fix sparse warnings hostap: don't report useless WDS frames by default textsearch: fix Boyer-Moore text search bug netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK ipv6 route: Convert rt6_device_match() to use RT6_LOOKUP_F_xxx flags. netlabel: Fix a problem when dumping the default IPv6 static labels net/inet_lro: remove setting skb->ip_summed when not LRO-able inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild CONNECTOR: add a proc entry to list connectors netlink: Fix some doc comments in net/netlink/attr.c tcp: /proc/net/tcp rto,ato values not scaled properly (v2) include/linux/netdevice.h: don't export MAX_HEADER to userspace ...
2008-07-02DRM/i915: only use tiled blits on 965+Jesse Barnes
When scheduled swaps occur, we need to blit between front & back buffers. If the buffers are tiled, we need to set the appropriate XY_SRC_COPY tile bit, but only on 965 chips, since it will cause corruption on pre-965 (e.g. 945). Bug reported by and fix tested by Tomas Janousek <tomi@nomi.cz>. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Dave Airlie <airlied@linux.ie> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-02drivers/input/ff-core.c needs <linux/sched.h>Geert Uytterhoeven
Commit 656acd2bbc4ce7f224de499ee255698701396c48 ("Input: fix locking in force-feedback core") causes the following regression on m68k: | linux/drivers/input/ff-core.c: In function 'input_ff_upload': | linux/drivers/input/ff-core.c:172: error: dereferencing pointer to incomplete type | linux/drivers/input/ff-core.c: In function 'erase_effect': | linux/drivers/input/ff-core.c:197: error: dereferencing pointer to incomplete type | linux/drivers/input/ff-core.c:204: error: dereferencing pointer to incomplete type | make[4]: *** [drivers/input/ff-core.o] Error 1 As the incomplete type is `struct task_struct', including <linux/sched.h> fixes it. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-03Merge branch 'for-2.6.26' of git://git.secretlab.ca/git/linux-2.6-mpc52xx ↵Paul Mackerras
into merge
2008-07-02PCI: acpiphp: cleanup notify handler on all root bridgesAlex Chiang
During the development of the physical PCI slot patch series, Gary Hade kept on reporting strange oopses due to interactions between pci_slot and acpiphp. http://lkml.org/lkml/2007/11/28/319 find_root_bridges() unconditionally installs handle_hotplug_event_bridge() as an ACPI_SYSTEM_NOTIFY handler for all root bridges. However, during module cleanup, remove_bridge() will only remove the notify handler iff the root bridge had a hot-pluggable slot directly underneath. That is: root bridge -> hotplug slot But, if the topology looks like either of the following: root bridge -> non-hotplug slot root bridge -> p2p bridge -> hotplug slot Then we currently do not remove the notify handler from that root bridge. This can cause a kernel oops if we modprobe acpiphp later and it gets loaded somewhere else in memory. If the root bridge then receives a hotplug event, it will then attempt to call a stale, non-existent notify handler and we blow up. Much thanks goes to Gary Hade for his persistent debugging efforts. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-02PCI: Limit VPD read/write lengths for Broadcom 5706, 5708, 5709 rev.Benjamin Li
For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the VPD end tag will hang the device. This problem was initially observed when a vpd entry was created in sysfs ('/sys/bus/pci/devices/<id>/vpd'). A read to this sysfs entry will dump 32k of data. Reading a full 32k will cause an access beyond the VPD end tag causing the device to hang. Once the device is hung, the bnx2 driver will not be able to reset the device. We believe that it is legal to read beyond the end tag and therefore the solution is to limit the read/write length. A majority of this patch is from Matthew Wilcox who gave code for reworking the PCI vpd size information. A PCI quirk added for the Broadcom NIC's to limit the read/write's. Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-02V4L/DVB (8178): uvc: Fix compilation breakage for the other drivers, if uvc ↵Mauro Carvalho Chehab
is selected UVC makefile defines obj as: obj-$(CONFIG_USB_VIDEO_CLASS) := uvcvideo.o Instead of: obj-$(CONFIG_USB_VIDEO_CLASS) += uvcvideo.o Due to that, if uvc is selected, all obj-y or obj-m that were added to compilation were forget. This breaks a proper kernel build. Acked-by: Laurent Pinchart <laurent.pinchart@skynet.be> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-02dm crypt: use cond_reschedMilan Broz
Add cond_resched() to prevent monopolising CPU when processing large bios. dm-crypt processes encryption of bios in sector units. If the bio request is big it can spend a long time in the encryption call. Signed-off-by: Milan Broz <mbroz@redhat.com> Tested-by: Yan Li <elliot.li.tech@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-07-01net: fib_rules: fix error code for unsupported familiesPatrick McHardy
The errno code returned must be negative. Fixes "RTNETLINK answers: Unknown error 18446744073709551519". Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01netdevice: Fix wrong string handle in kernel command line parsingWang Chen
v1->v2: Use strlcpy() to ensure s[i].name be null-termination. 1. In netdev_boot_setup_add(), a long name will leak. ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname......... 2. In netdev_boot_setup_check(), mismatch will happen if s[i].name is a substring of dev->name. ex. : dev=...eth1 dev=...eth11 [ With feedback from Ben Hutchings. ] Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01net: Tyop of sk_filter() commentWang Chen
Parameter "needlock" no long exists. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01netlink: Unneeded local variableWang Chen
We already have a variable, which has the same capability. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01net-sched: fix filter destruction in atm/hfsc qdisc destructionPatrick McHardy
Filters need to be destroyed before beginning to destroy classes since the destination class needs to still be alive to unbind the filter. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01net-sched: change tcf_destroy_chain() to clear start of filter listPatrick McHardy
Pass double tcf_proto pointers to tcf_destroy_chain() to make it clear the start of the filter list for more consistency. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01ipv4: fix sysctl documentation of time related valuesStephen Hemminger
These sysctl values are time related and all use the same routine (proc_dointvec_jiffies) that internally converts from seconds to jiffies. The code is fine, the documentation is just wrong. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01powerpc/mpc5200: Fix lite5200b suspend/resumeTim Yamin
Suspend/resume ("echo mem > /sys/power/state") does not work with vanilla kernels -- the system does not suspend correctly and just hangs. This patch fixes this so suspend/resume works: 1) of_iomap does not map the whole 0xC000 of the MPC5200 immr so saving registers does not work. 2) PCI registers need to be saved and restored. Signed-off-by: Tim Yamin <plasm@roo.me.uk> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-01powerpc/legacy_serial: Bail if reg-offset/shift properties are presentJohn Linn
The legacy serial driver does not work with an 8250 type UART that is described in the device tree with the reg-offset and reg-shift properties. This change makes legacy_serial ignore these devices. Signed-off-by: John Linn <john.linn@xilinx.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-01i2c: Fix bad hint about irqs in i2c.hWolfram Sang
i2c.h mentions -1 as a not-issued irq. This false hint was taken by of_i2c and caused crashes. Don't give any advice as 'no irq' is not consistent across all architectures yet and it is not needed internally by the i2c-core. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-07-01i2c: Documentation: fix device matching descriptionBen Dooks
The matching process described for new style clients in Documentation/i2c/writing-clients is classed as out-of-date as it requires the presence of an .id_table entry in the driver's i2c_driver entry. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-07-01powerpc/bootwrapper: update for initrd with simpleImageJohn Linn
This change to the makefile corrects the build of a simpleImage with initrd. Signed-off-by: John Linn <john.linn@xilinx> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-01PCI: Restrict VPD read permission to rootBen Hutchings
Some PCI devices will lock up if we attempt to read from VPD addresses beyond some device-dependent limit. Until we can identify these devices and adjust the file size accordingly, only let root read VPD through sysfs to prevent a DoS by normal users. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-01I2C: S3C2410: Add MODULE_ALIAS() for s3c2440 device.Ben Dooks
Add a MODULE_ALIAS() statement for the i2c-s3c2410 controller to ensure that it can be autoloaded on the S3C2440 systems that we support. Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2008-07-01I2C: S3C2410: Fixup error codes returned rom a transfer.Ben Dooks
The driver should be returning -ENXIO for transfers that do not pass the initial address byte stage. Note, also small tidyups to the driver comments in the area. Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2008-07-01I2C: S3C2410: Check ACK on byte transmissionBen Dooks
We should check for the reception of an ACK after transmitting each data byte. The address send has been correctly checking this, but the data write byte state should have also been checking for these failures. As part of the same fix, we remove the ACK checking from the receive path where it should not have been checking for an ACK which our hardware was sending. Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2008-07-01rcu: fix hotplug vs rcu raceGautham R Shenoy
Dhaval Giani reported this warning during cpu hotplug stress-tests: | On running kernel compiles in parallel with cpu hotplug: | | WARNING: at arch/x86/kernel/smp.c:118 | native_smp_send_reschedule+0x21/0x36() | Modules linked in: | Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1 | [...] | [<c0110355>] native_smp_send_reschedule+0x21/0x36 | [<c014fe8f>] force_quiescent_state+0x47/0x57 | [<c014fef0>] call_rcu+0x51/0x6d | [<c01713b3>] __fput+0x130/0x158 | [<c0171231>] fput+0x17/0x19 | [<c016fd99>] filp_close+0x4d/0x57 | [<c016fdff>] sys_close+0x5c/0x97 IMHO the warning is a spurious one. cpu_online_map is updated by the _cpu_down() using stop_machine_run(). Since force_quiescent_state is invoked from irqs disabled section, stop_machine_run() won't be executing while a cpu is executing force_quiescent_state(). Hence the cpu_online_map is stable while we're in the irq disabled section. However, a cpu might have been offlined _just_ before we disabled irqs while entering force_quiescent_state(). And rcu subsystem might not yet have handled the CPU_DEAD notification, leading to the offlined cpu's bit being set in the rcp->cpumask. Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent sending smp_reschedule() to an offlined CPU. Here's the timeline: CPU_A CPU_B -------------------------------------------------------------- cpu_down(): . . . . . stop_machine(): /* disables preemption, . * and irqs */ . . . . . take_cpu_down(); . . . . . . . cpu_disable(); /*this removes cpu . *from cpu_online_map . */ . . . . . restart_machine(); /* enables irqs */ . ------WINDOW DURING WHICH rcp->cpumask is stale --------------- . call_rcu(); . /* disables irqs here */ . .force_quiescent_state(); .CPU_DEAD: .for_each_cpu(rcp->cpumask) . . smp_send_reschedule(); . . . . WARN_ON() for offlined CPU! . . . rcu_cpu_notify: . -------- WINDOW ENDS ------------------------------------------ rcu_offline_cpu() /* Which calls cpu_quiet() * which removes * cpu from rcp->cpumask. */ If a new batch was started just before calling stop_machine_run(), the "tobe-offlined" cpu is still present in rcp-cpumask. During a cpu-offline, from take_cpu_down(), we queue an rt-prio idle task as the next task to be picked by the scheduler. We also call cpu_disable() which will disable any further interrupts and remove the cpu's bit from the cpu_online_map. Once the stop_machine_run() successfully calls take_cpu_down(), it calls schedule(). That's the last time a schedule is called on the offlined cpu, and hence the last time when rdp->passed_quiesc will be set to 1 through rcu_qsctr_inc(). But the cpu_quiet() will be on this cpu will be called only when the next RCU_SOFTIRQ occurs on this CPU. So at this time, the offlined CPU is still set in rcp->cpumask. Now coming back to the idle_task which truely offlines the CPU, it does check for a pending RCU and raises the softirq, since it will find rdp->passed_quiesc to be 0 in this case. However, since the cpu is offline I am not sure if the softirq will trigger on the CPU. Even if it doesn't the rcu_offline_cpu() will find that rcp->completed is not the same as rcp->cur, which means that our cpu could be holding up the grace period progression. Hence we call cpu_quiet() and move ahead. But because of the window explained in the timeline, we could still have a call_rcu() before the RCU subsystem executes it's CPU_DEAD notification, and we send smp_send_reschedule() to offlined cpu while trying to force the quiescent states. The appended patch adds comments and prevents checking for offlined cpu everytime. cpu_online_map is updated by the _cpu_down() using stop_machine_run(). Since force_quiescent_state is invoked from irqs disabled section, stop_machine_run() won't be executing while a cpu is executing force_quiescent_state(). Hence the cpu_online_map is stable while we're in the irq disabled section. Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: laijs@cn.fujitsu.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russel <rusty@rustcorp.com.au> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-01Properly notify block layer of sync writesJens Axboe
fsync_buffers_list() and sync_dirty_buffer() both issue async writes and then immediately wait on them. Conceptually, that makes them sync writes and we should treat them as such so that the IO schedulers can handle them appropriately. This patch fixes a write starvation issue that Lin Ming reported, where xx is stuck for more than 2 minutes because of a large number of synchronous IO in the system: INFO: task kjournald:20558 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff810010820978 6712 20558 2 ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2 ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb 0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537 Call Trace: [<ffffffff803ba6f2>] kobject_get+0x12/0x17 [<ffffffff80247537>] getnstimeofday+0x2f/0x83 [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d195>] io_schedule+0x5d/0x9f [<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f [<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff80243909>] wake_bit_function+0x0/0x23 [<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb [<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6 [<ffffffff8023a676>] lock_timer_base+0x26/0x4b [<ffffffff8030300a>] kjournald+0xc1/0x1fb [<ffffffff802438db>] autoremove_wake_function+0x0/0x2e [<ffffffff80302f49>] kjournald+0x0/0x1fb [<ffffffff802437bb>] kthread+0x47/0x74 [<ffffffff8022de51>] schedule_tail+0x28/0x5d [<ffffffff8020cac8>] child_rip+0xa/0x12 [<ffffffff80243774>] kthread+0x0/0x74 [<ffffffff8020cabe>] child_rip+0x0/0x12 Lin Ming confirms that this patch fixes the issue. I've run tests with it for the past week and no ill effects have been observed, so I'm proposing it for inclusion into 2.6.26. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>