aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-05-09[POWERPC] Remove leftover printk in isa-bridge.cNate Case
This printk() appears twice in the same function. Only the latter one in the inval_range: section appears to be legitimate. Signed-off-by: Nate Case <ncase@xes-inc.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09[POWERPC] Remove duplicate #includeHuang Weiyi
Remove duplicate #include of <asm/prom.h> in arch/powerpc/kernel/btext.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09[POWERPC] Initialize lockdep earlierBenjamin Herrenschmidt
This moves lockdep_init() to before udbg_early_init() as the later can call things that acquire spinlocks etc... This also makes printk safer to use earlier. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09[POWERPC] Document when printk is useableBenjamin Herrenschmidt
When debugging early boot problems, it's common to sprinkle printk's all over the place. However, on 64-bit powerpc, this can lead to memory corruption if done too early due to the PACA pointer and lockdep core not being initialized. This adds some comments to early_setup() that document when it is safe to do so in order to save time for whoever has to debug that stuff next. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09[POWERPC] Fix bogus paca->_current initializationBenjamin Herrenschmidt
When doing lockdep, I had two patches to initialize paca->_current early, one bogus, and one correct. Unfortunately both got merged as the bad one ended up being part of the main lockdep patch by mistake. This causes memory corruption at boot. This removes the offending code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09[POWERPC] Fix of_i2c include for module compilationJochen Friedrich
Remove #ifdef CONFIG_OF_I2C as this breaks module compilation. Drivers using this header should depend on OF_I2C anyways, so there's no need to make this conditional. Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09[POWERPC] Make default cputable entries reflect selected CPU familyBenjamin Herrenschmidt
Changes the cputable so that various CPU families that have an exclusive CONFIG_ option have a more sensible default entry to use if the specific processor hasn't been identified. This makes the kernel more generally useful when booted on an unknown PVR for things like new 4xx variants. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09Merge branch 'for-2.6.26' of ↵Paul Mackerras
master.kernel.org:/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx into merge
2008-05-09sh: Fix DMAC base address for SH7709SSteve Glendinning
On SH7709S, DMAC can be found at 0xa4000020 (as with most of the other sh3 cpu subtypes). Split out definition of DMAC base address from definitions of DMTE irqs. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-05-08sit: Add missing kfree_skb() on pskb_may_pull() failure.David S. Miller
Noticed by Paul Marks <paul@pmarks.net>. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-09sh: update smc91x platform data for se7206.Paul Mundt
Follows the se7722 change. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-05-08tipc: Increase buffer header to support worst-case deviceAllan Stephens
This patch increases the headroom TIPC reserves in each sk_buff to accommodate the largest possible link level device header. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08sparc video: remove open boot prom codeRobert Reif
Replace remaining open boot prom code with of. Boot tested on sparc32 and compile tested on sparc64. Signed-off-by: Robert Reif <reif@earthlink.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-09[CIFS] fix build warningSteve French
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-05-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) net: Added ASSERT_RTNL() to dev_open() and dev_close(). can: Fix can_send() handling on dev_queue_xmit() failures netns: Fix arbitrary net_device-s corruptions on net_ns stop. netfilter: Kconfig: default DCCP/SCTP conntrack support to the protocol config values netfilter: nf_conntrack_sip: restrict RTP expect flushing on error to last request macvlan: Fix memleak on device removal/crash on module removal net/ipv4: correct RFC 1122 section reference in comment tcp FRTO: SACK variant is errorneously used with NewReno e1000e: don't return half-read eeprom on error ucc_geth: Don't use RX clock as TX clock. cxgb3: Use CAP_SYS_RAWIO for firmware pcnet32: delete non NAPI code from driver. fs_enet: Fix a memory leak in fs_enet_mdio_probe [netdrvr] eexpress: IPv6 fails - multicast problems 3c59x: use netstats in net_device structure 3c980-TX needs EXTRA_PREAMBLE fix warning in drivers/net/appletalk/cops.c e1000e: Add support for BM PHYs on ICH9 uli526x: fix endianness issues in the setup frame uli526x: initialize the hardware prior to requesting interrupts ...
2008-05-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc: Fix SA_ONSTACK signal handling.
2008-05-08Revert "PCI: remove default PCI expansion ROM memory allocation"Linus Torvalds
This reverts commit 9f8daccaa05c14e5643bdd4faf5aed9cc8e6f11e, which was reported to break X startup (xf86-video-ati-6.8.0). See http://bugs.freedesktop.org/show_bug.cgi?id=15523 for details. Reported-by: Laurence Withers <l@lwithers.me.uk> Cc: Gary Hade <garyhade@us.ibm.com> Cc: Greg KH <greg@kroah.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08[CIFS] Fixed build warning in is_ipIgor Mammedov
Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-05-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes: sched: fix weight calculations semaphore: fix
2008-05-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: [ALSA] soc at91 minor bug fixes [ALSA] soc - at91-pcm - Fix line wrapping pcspkr: fix dependancies
2008-05-08Remove duplicated include in net/sunrpc/svc.cHuang Weiyi
<linux/sched.h> we included twice. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fs/proc/task_mmu.c: remove duplicated include filesHuang Weiyi
Removed duplicated include files <linux/ptrace.h> and <linux/seq_file.h> in fs/proc/task_mmu.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Fix drivers/media build for modular buildsIngo Molnar
Fix allmodconfig build bug introduced in latest -git by commit 7c91f0624a9 ("V4L/DVB(7767): Move tuners to common/tuners"): LD kernel/built-in.o LD drivers/built-in.o ld: drivers/media/built-in.o: No such file: No such file or directory which happens if all media drivers are modular: http://redhat.com/~mingo/misc/config-Wed_Apr_30_09_24_48_CEST_2008.bad In that case there's no obj-y rule connecting all the built-in.o files and the link tree breaks. The fix is to add a guaranteed obj-y rule for the core vmlinux to build. (which results in an empty object file if all media drivers are modular) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/ehca: Wait for async events to finish before destroying QP IB/ipath: Fix SDMA error recovery in absence of link status change IB/ipath: Need to always request and handle PIO avail interrupts IB/ipath: Fix count of packets received by kernel IB/ipath: Return the correct opcode for RDMA WRITE with immediate IB/ipath: Fix bug that can leave sends disabled after freeze recovery IB/ipath: Only increment SSN if WQE is put on send queue IB/ipath: Only warn about prototype chip during init RDMA/cxgb3: Fix severe limit on userspace memory registration size RDMA/cxgb3: Don't add PBL memory to gen_pool in chunks
2008-05-08MN10300: Make cpu_relax() invoke barrier()David Howells
Make cpu_relax() invoke barrier() to be the same as other arches. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: Revert "relay: fix splice problem" docbook: fix bio missing parameter block: use unitialized_var() in bio_alloc_bioset() block: avoid duplicate calls to get_part() in disk stat code cfq-iosched: make io priorities inherit CPU scheduling class as well as nice block: optimize generic_unplug_device() block: get rid of likely/unlikely predictions in merge logic vfs: splice remove_suid() cleanup cfq-iosched: fix RCU race in the cfq io_context destructor handling block: adjust tagging function queue bit locking block: sysfs store function needs to grab queue_lock and use queue_flag_*()
2008-05-08Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: Fix memory corruption when fs mounted with noadinicb option udf: Make udf exportable udf: fs/udf/partition.c:udf_get_pblock() mustn't be inline
2008-05-08Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6Linus Torvalds
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] guest page hinting light [S390] tty3270: fix put_char fail/success conversion. [S390] compat ptrace cleanup [S390] s390mach compile warning [S390] cio: Fix parsing mechanism for blacklisted devices. [S390] cio: Remove cio_msg kernel parameter. [S390] s390-kvm: leave sie context on work. Removes preemption requirement [S390] s390: Optimize user and work TIF check
2008-05-08drivers/scsi/dpt_i2o.c: fix build on alphaAndrew Morton
alpha: drivers/scsi/dpt_i2o.c:1997: error: implicit declaration of function 'adpt_alpha_info' drivers/scsi/dpt_i2o.c: At top level: drivers/scsi/dpt_i2o.c:2032: warning: conflicting types for 'adpt_alpha_info' drivers/scsi/dpt_i2o.c:2032: error: static declaration of 'adpt_alpha_info' follows non-static declaration drivers/scsi/dpt_i2o.c:1997: error: previous implicit declaration of 'adpt_alpha_info' was here Due to a copy-n-paste error in drivers/scsi/dpti.h. Fix that up and remove some of the many daft static-declarations-in-a-header which this driver enjoys. Cc: Miquel van Smoorenburg <miquels@cistron.nl> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Fix cpuset sched_relax_domain_level control filePaul Menage
Due to a merge conflict, the sched_relax_domain_level control file was marked as being handled by cpuset_read/write_u64, but the code to handle it was actually in cpuset_common_file_read/write. Since the value being written/read is in fact a signed integer, it should be treated as such; this patch adds cpuset_read/write_s64 functions, and uses them to handle the sched_relax_domain_level file. With this patch, the sched_relax_domain_level can be read and written, and the correct contents seen/updated. Signed-off-by: Paul Menage <menage@google.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Paul Jackson <pj@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08slub: fix atomic usage in any_slab_objects()Benjamin Herrenschmidt
any_slab_objects() does an atomic_read on an atomic_long_t, this fixes it to use atomic_long_read instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08sys_pipe(): fix file descriptor leaksUlrich Drepper
Remember to close the files if copy_to_user() failed. Spotted by dm.n9107@gmail.com. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: DM <dm.n9107@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08vt: fix canonical input in UTF-8 modeSamuel Thibault
For e.g. proper TTY canonical support, IUTF8 termios flag has to be set as appropriate. Linux used to not care about setting that flag for VT TTYs. This patch fixes that by activating it according to the current mode of the VT, and sets the default value according to the vt.default_utf8 parameter. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Willy Tarreau <w@1wt.eu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08usb/asix: add Buffalo LUA-U2-GT 10/100/1000Mattia Dongili
The USB net adapter Buffalo LUA-U2-GT (0411:006e) carries a AX88178 chip. Tested on the above HW. Signed-off-by: Mattia Dongili <malattia@linux.it> Acked-off-by: David Hollis <dhollis@davehollis.com> Cc: Greg KH <greg@kroah.com> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08sx.c: fix printk warnings on sparc32Andrew Morton
drivers/char/sx.c: In function 'sx_set_real_termios': drivers/char/sx.c:973: warning: format '%u' expects type 'unsigned int', but argument 2 has type 'long unsigned int' drivers/char/sx.c:999: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'tcflag_t' drivers/char/sx.c:1012: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'tcflag_t' sparc32 seems to use weird types for its tty things. [ Fine by me but this is ancient debug and most of the debug in sx just wants deleting eventually. - Alan ] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08uml: fix inconsistence due to tty_operation changeWANG Cong
'put_char' of 'struct tty_operations' has changed from 'void' into 'int'. This can also shut up compiler warnings. Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: WANG Cong <wangcong@zeuux.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08misc: fix integer as NULL pointer warningsHarvey Harrison
drivers/md/raid10.c:889:17: warning: Using plain integer as NULL pointer drivers/media/video/cx18/cx18-driver.c:616:12: warning: Using plain integer as NULL pointer sound/oss/kahlua.c:70:12: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Neil Brown <neilb@suse.de> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags for iuu_phoenix.cSteven Rostedt
The file drivers/usb/serial/iuu_phoenix.c uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags in rtc-ds1511Steven Rostedt
The file in drivers/rtc/rtc-ds1551.c uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags in saa7134Steven Rostedt
Some files in the drivers/media/video/saa7134 directory uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08fix irq flags in mac80211 codeSteven Rostedt
A file in the net/mac80211 directory uses "int" for flags. This can cause hard to find bugs on some architectures. This patch converts the flags to use "long" instead. This bug was discovered by doing an allyesconfig make on the -rt kernel where checks are done to ensure all flags are of size sizeof(long). Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08serial: access after NULL check in uart_flush_buffer()Tetsuo Handa
I noticed that static void uart_flush_buffer(struct tty_struct *tty) { struct uart_state *state = tty->driver_data; struct uart_port *port = state->port; unsigned long flags; /* * This means you called this function _after_ the port was * closed. No cookie for you. */ if (!state || !state->info) { WARN_ON(1); return; } is too late for checking state != NULL. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08Kconfig: improved help for CONFIG_ACCESSIBILITYSamuel Thibault
Add a small explanation of what accessibility is. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08[ARM] pxa: Fix RCSR handlingRussell King
Related to d3930614e68bdf83a120d904c039a64e9f75dba1. RCSR is only present on PXA2xx CPUs, not on PXA3xx CPUs. Therefore, we should not be unconditionally writing to RCSR from generic code. Since we now clear the RCSR status from the SoC specific PXA PM code and before reset in the arch_reset() function, the duplication in the corgi, poodle, spitz and tosa code can be removed. Acked-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-05-08sched: fix weight calculationsMike Galbraith
The conversion between virtual and real time is as follows: dvt = rw/w * dt <=> dt = w/rw * dvt Since we want the fair sleeper granularity to be in real time, we actually need to do: dvt = - rw/w * l This bug could be related to the regression reported by Yanmin Zhang: | Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has lots | of regressions with 2.6.26-rc1: | | 1) 8-core stoakley: 28%; | 2) 16-core tigerton: 20%; | 3) Itanium Montvale: 50%. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08semaphore: fixIngo Molnar
Yanmin Zhang reported: | Comparing with kernel 2.6.25, AIM7 (use tmpfs) has more th | regression under 2.6.26-rc1 on my 8-core stoakley, 16-core tigerton, | and Itanium Montecito. Bisect located the patch below: | | 64ac24e738823161693bf791f87adc802cf529ff is first bad commit | commit 64ac24e738823161693bf791f87adc802cf529ff | Author: Matthew Wilcox <matthew@wil.cx> | Date: Fri Mar 7 21:55:58 2008 -0500 | | Generic semaphore implementation | | After I manually reverted the patch against 2.6.26-rc1 while fixing | lots of conflicts/errors, aim7 regression became less than 2%. i reproduced the AIM7 workload and can confirm Yanmin's findings that -.26-rc1 regresses over .25 - by over 67% here. Looking at the workload i found and fixed what i believe to be the real bug causing the AIM7 regression: it was inefficient wakeup / scheduling / locking behavior of the new generic semaphore code, causing suboptimal performance. The problem comes from the following code. The new semaphore code does this on down(): spin_lock_irqsave(&sem->lock, flags); if (likely(sem->count > 0)) sem->count--; else __down(sem); spin_unlock_irqrestore(&sem->lock, flags); and this on up(): spin_lock_irqsave(&sem->lock, flags); if (likely(list_empty(&sem->wait_list))) sem->count++; else __up(sem); spin_unlock_irqrestore(&sem->lock, flags); where __up() does: list_del(&waiter->list); waiter->up = 1; wake_up_process(waiter->task); and where __down() does this in essence: list_add_tail(&waiter.list, &sem->wait_list); waiter.task = task; waiter.up = 0; for (;;) { [...] spin_unlock_irq(&sem->lock); timeout = schedule_timeout(timeout); spin_lock_irq(&sem->lock); if (waiter.up) return 0; } the fastpath looks good and obvious, but note the following property of the contended path: if there's a task on the ->wait_list, the up() of the current owner will "pass over" ownership to that waiting task, in a wake-one manner, via the waiter->up flag and by removing the waiter from the wait list. That is all and fine in principle, but as implemented in kernel/semaphore.c it also creates a nasty, hidden source of contention! The contention comes from the following property of the new semaphore code: the new owner owns the semaphore exclusively, even if it is not running yet. So if the old owner, even if just a few instructions later, does a down() [lock_kernel()] again, it will be blocked and will have to wait on the new owner to eventually be scheduled (possibly on another CPU)! Or if another task gets to lock_kernel() sooner than the "new owner" scheduled, it will be blocked unnecessarily and for a very long time when there are 2000 tasks running. I.e. the implementation of the new semaphores code does wake-one and lock ownership in a very restrictive way - it does not allow opportunistic re-locking of the lock at all and keeps the scheduler from picking task order intelligently. This kind of scheduling, with 2000 AIM7 processes running, creates awful cross-scheduling between those 2000 tasks, causes reduced parallelism, a throttled runqueue length and a lot of idle time. With increasing number of CPUs it causes an exponentially worse behavior in AIM7, as the chance for a newly woken new-owner task to actually run anytime soon is less and less likely. Note that it takes just a tiny bit of contention for the 'new-semaphore catastrophy' to happen: the wakeup latencies get added to whatever small contention there is, and quickly snowball out of control! I believe Yanmin's findings and numbers support this analysis too. The best fix for this problem is to use the same scheduling logic that the kernel/mutex.c code uses: keep the wake-one behavior (that is OK and wanted because we do not want to over-schedule), but also allow opportunistic locking of the lock even if a wakee is already "in flight". The patch below implements this new logic. With this patch applied the AIM7 regression is largely fixed on my quad testbox: # v2.6.25 vanilla: .................. Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 56096.4 91 207.5 789.7 0.4675 2000 55894.4 94 208.2 792.7 0.4658 # v2.6.26-rc1-166-gc0a1811 vanilla: ................................... Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 33230.6 83 350.3 784.5 0.2769 2000 31778.1 86 366.3 783.6 0.2648 # v2.6.26-rc1-166-gc0a1811 + semaphore-speedup: ............................................... Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 55707.1 92 209.0 795.6 0.4642 2000 55704.4 96 209.0 796.0 0.4642 i.e. a 67% speedup. We are now back to within 1% of the v2.6.25 performance levels and have zero idle time during the test, as expected. Btw., interactivity also improved dramatically with the fix - for example console-switching became almost instantaneous during this workload (which after all is running 2000 tasks at once!), without the patch it was stuck for a minute at times. There's another nice side-effect of this speedup patch, the new generic semaphore code got even smaller: text data bss dec hex filename 1241 0 0 1241 4d9 semaphore.o.before 1207 0 0 1207 4b7 semaphore.o.after (because the waiter.up complication got removed.) Longer-term we should look into using the mutex code for the generic semaphore code as well - but i's not easy due to legacies and it's outside of the scope of v2.6.26 and outside the scope of this patch as well. Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08x86: cleanup PAT cpu validationThomas Gleixner
Move the scattered checks for PAT support to a single function. Its moved to addon_cpuid_features.c as this file is shared between 32 and 64 bit. Remove the manipulation of the PAT feature bit and just disable PAT in the PAT layer, based on the PAT bit provided by the CPU and the current CPU version/model white list. Change the boot CPU check so it works on Voyager somewhere in the future as well :) Also panic, when a secondary has PAT disabled but the primary one has alrady switched to PAT. We have no way to undo that. The white list is kept for now to ensure that we can rely on known to work CPU types and concentrate on the software induced problems instead of fighthing CPU erratas and subtle wreckage caused by not yet verified CPUs. Once the PAT code has stabilized enough, we can remove the white list and open the can of worms. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08x86: geode: define geode_has_vsa2() even if CONFIG_MGEODE_LX is not setAndres Salomon
We want drivers to be able to use geode_has_vsa2 without having to worry about what model geode is being compiled for. This patch ensures that geode_has_vsa2 is always defined. Signed-off-by: Andres Salomon <dilinger@debian.org> Cc: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08x86: GEODE: cache results from geode_has_vsa2() and uninlineAndres Salomon
This moves geode_has_vsa2 into a .c file, caches the result we get from the VSA virtual registers, and causes the function to no longer be inline. [akpm@linux-foundation.org: cleanup] Signed-off-by: Andres Salomon <dilinger@debian.org> Cc: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08x86: revert geode config dependencyThomas Gleixner
commit e26a28d190304d910ee49b81cbfe6d9241f56e86 x86: olpc build fix was a fix to a patch that was withdrawn/delayed and then erroneously commited to x86.git. Revert it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>