Age | Commit message (Collapse) | Author |
|
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] PNX8550: Fix system timer support
[MIPS] TX49: Fix use of CDEX build_store_reg()
[MIPS] pnx8550: Fix write_config_byte() PCI config space accessor
[MIPS] Fix build errors on SEAD
[MIPS] SMTC build fix
[MIPS] csum_partial and copy in parallel
[MIPS] Malta: Add missing MTD file.
|
|
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] Provide basic printk_clock() implementation
[ARM] Resolve fuse and direct-IO failures due to missing cache flushes
[ARM] pass vma for flush_anon_page()
[ARM] Fix potential MMCI bug
[ARM] Fix kernel-mode undefined instruction aborts
[ARM] 4082/1: iop3xx: fix iop33x gpio register offset
[ARM] 4070/1: arch/arm/kernel: fix warnings from missing includes
[ARM] 4079/1: iop: Update MAINTAINERS
|
|
This reverts commit b026872601976f666bae77b609dc490d1834bf77, which has
been linked to several problem reports with IO-APIC and the timer.
Machines either don't boot because the timer doesn't happen, or we get
double timer interrupts because we end up double-routing the timer irq
through multiple interfaces.
See for example
http://lkml.org/lkml/2006/12/16/101
http://lkml.org/lkml/2007/1/3/9
http://bugzilla.kernel.org/show_bug.cgi?id=7789
about some of the discussion.
Patches to fix this cleanup exist (and have been confirmed to work fine
at least for some of the affected cases) and we'll revisit it for
2.6.21, but this late in the -rc series we're better off just reverting
the incomplete commit that caused the problems.
Suggested-by: Adrian Bunk <bunk@stusta.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Yinghai Lu <yinghai.lu@amd.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
the patch inlined below restores proper time accounting for PNX8550-based
boards. It also gets rid of #ifdef in the generic code which becomes
unnecessary then.
It's functionally identical to the previous patch with the same name but
it has minor comments from Atsushi and Sergei taken into account.
Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The commit a923660d786a53e78834b19062f7af2535f7f8ad accidently
prevents TX49 from using CDEX. Use build_dst_pref() only if prefetch
for store was really available.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
There's a serious typo in the function:
arch/mips/pci/ops-pnx8550.c:write_config_byte()
The parameter passed to the function config_access() is PCI_CMD_CONFIG_READ
instead of PCI_CMD_CONFIG_WRITE. This renders any attempts to write
a single byte to the PCI configuration registers useless.
This problem does not exist for write_config_word() nor write_config_dword().
This problem has been there since kernel v2.6.17 and is still there
as of kernel v2.6.19.1.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Quick and dirty fix for build errors on SEAD.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Implement optimized asm version of csum_partial_copy_nocheck,
csum_partial_copy_from_user and csum_and_copy_to_user which can do
calculate and copy in parallel, based on memcpy.S.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Current sched_clock() implementations on ARM cause unbootable kernels
with PRINTK_TIME support enabled. To avoid this, provide a basic
printk_clock() implementation which avoids sched_clock() being called
before the page tables have been set up.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
fuse does not work on ARM due to cache incoherency issues - fuse wants
to use get_user_pages() to copy data from the current process into
kernel space. However, since this accesses userspace via the kernel
mapping, the kernel mapping can be out of date wrt data written to
userspace.
This can lead to unpredictable behaviour (in the case of fuse) or data
corruption for direct-IO.
This resolves debian bug #402876
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
If the kernel attempts to execute a CP1 or CP2 instruction and it
aborts, and a FP emulator is not loaded, we try to return as if to
a user context, instead of the proper kernel context. Since the
fault came from kernel mode, we must use the kernel return paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Include <asm/io.h> to fix the warning:
arch/arm/kernel/traps.c:647:6: warning: symbol '__readwrite_bug' was not declared. Should it be static?
Include <linux/mc146818rtc.h> to fix the warning:
arch/arm/kernel/time.c:42:1: warning: symbol 'rtc_lock' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
o Currently synchronize_tsc_ap() is of type __init. It is called by
smp_callin() which is of type __cpuinit. So synchronize_tsc_ap()
should be of type __cpuinit.
o Modpost generates warnings for i386 if CONFIG_RELOCATABLE=y and
CONFIG_HOTPLUG_CPU=y
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164dc) and 'initialize_secondary'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164e8) and 'initialize_secondary'
o tsc is of type __initdata. It should be of type __cpuinitdata.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
o MODPOST generates warning for i386 if kernel is compiled with
CONFIG_RELOCATABLE=y
WARNING: vmlinux - Section mismatch: reference to .init.data: from .data between 'this_cpu' (at offset 0xc05194d0) and 'cpuinfo_op'
o this_cpu pointer should be of type __cpuinitdata.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
o MODPOST generates warning for i386 if kernel is compiled with
CONFIG_RELOCATABLE=y
WARNING: vmlinux - Section mismatch: reference to .init.text:startup_32_smp
from .data between 'trampoline_data' (at offset 0xc0519cf8) and 'boot_gdt'
o trampoline code/data can go into init section is CPU hotplug is not
enabled.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
o Relocatable bzImage support had got rid of CONFIG_PHYSICAL_START option
thinking that now this option is not required as people can build a
second kernel as relocatable and load it anywhere. So need of compiling
the kernel for a custom address was gone. But Magnus uses vmlinux images
for second kernel in Xen environment and he wants to continue to use
it.
o Restoring the CONFIG_PHYSICAL_START option for the time being. I think
down the line we can get rid of it.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: asus_acpi: new MAINTAINER
ACPI: fix section mis-match build warning
ACPI: increase ACPI_MAX_REFERENCE_COUNT for larger systems
ACPI: EC: move verbose printk to debug build only
backlight: fix backlight_device_register compile failures
|
|
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] longhaul: Kill off warnings introduced by recent changes.
[CPUFREQ] Uninitialized use of cmd.val in arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c:acpi_cpufreq_target()
[CPUFREQ] Longhaul - Always guess FSB
[CPUFREQ] Longhaul - Fix up powersaver assumptions.
[CPUFREQ] longhaul: Fix up unreachable code.
[CPUFREQ] speedstep-centrino: missing space and bracket
[CPUFREQ] Bug fix for acpi-cpufreq and cpufreq_stats oops on frequency change notification
[CPUFREQ] select consistently
|
|
If caller passed the tsk, we should use it to validate a stack ptr.
Otherwise, sysrq-t and other debugging stuff doesn't work.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Bunch of unused vars + one case where gcc isn't smart enough.
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c:acpi_cpufreq_target()
cmd.val was used uninitialized on the line below.
Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
This is patch that solves Ebox mini PC issue and make
FSB code more specification compilant. At start guess_fsb
function is guessing 200MHz FSB too. It is better to
make it in this way because, thanks to this function, driver
will fail for bogus FSB values caused by bogus multiplier
value. For PowerSaver processors we can't depend on Max /
MinMHzFSB because these values are only used for
PowerSaver 2.0 and 3.0. Most processors on which Longhaul
is used are PowerSaver 1.0 only. I'm changing code for older
CPU's too, but not so much as previously, and this code was
already used for Ezra. Using MinMHzBR for Ezra-T is outside
spec. It is for voltage scaling purpose and don't have to
be equal to minmult (but it is). Same for Nehemiah (it
isn't for sure). Added mult - current multiplier value.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4081/1: Add definition for TI Sync Serial Protocol
[ARM] 4080/1: Fix for the SSCR0_SlotsPerFrm macro
[ARM] Fix VFP initialisation issue for SMP systems
[ARM] 4078/1: Fix ARM copypage cache coherency problems
[ARM] 4077/1: iop13xx: fix __io() macro
[ARM] 4074/1: Flat loader stack alignment
[ARM] 4073/1: Prevent s3c24xx drivers from including asm/arch/hardware.h and asm/arch/irqs.h
[ARM] 4071/1: S3C24XX: Documentation update
[ARM] 4066/1: correct a comment about PXA's sched_clock range
[ARM] 4065/1: S3C24XX: dma printk fixes
[ARM] 4064/1: make pxa_get_cycles() static
[ARM] 4063/1: ep93xx: fix IRQ_EP93XX_GPIO?MUX numbering
|
|
When we install the handlers for context switching, we must enable
VFP on all CPU cores, otherwise undefined (and random) effects
occur.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Commit 968de4f02621db35b8ae5239c8cfc6664fb872d8 ("i386: Relocatable
kernel support") caused problems for people with old binutils versions
that didn't mark ".text.*" sections automatically allocated.
So we should use .section command to specifically mark .text.head
section as AX (allocatable and executable) to solve the problem.
This should be unnecessary with binutils 2.15 and later, which is
already three years old, but it doesn't hurt supporting older toolchains
where possible.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Dunno why this pops out in only in the allmodconfig build.
Though the warning is accurate, all the callers of the flagged
non __init function are __init, this is not a functional change.
WARNING: vmlinux - Section mismatch: reference to .init.data:acpi_sci_flags from .text between 'acpi_sci_ioapic_setup' (at offset 0xc010f0a
6) and 'acpi_gsi_to_irq' WARNING: vmlinux - Section mismatch: reference to .init.text:mp_override_legacy_irq from .text between 'acpi_sci_ioapic_setup' (at offset 0
xc010f0de) and 'acpi_gsi_to_irq' WARNING: vmlinux - Section mismatch: reference to .init.data:acpi_sci_override_gsi from .text between 'acpi_sci_ioapic_setup' (at offset 0x
c010f0e4) and 'acpi_gsi_to_irq'
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
CALGARY_IOMMU_ENABLED_BY_DEFAULT"
This reverts commit a9622f6219ce58faba1417743bf3078501eb3434. Now that
the Calgary code apparently detects itself properly, it's not needed any
more.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
And this points out that the return value from
isa_dev_get_resource() and the 'pregs' arg to
isa_dev_get_irq() are totally unused.
Based upon a patch from Richard Mortimer <richm@oldelvet.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need to pass in the resource otherwise we cannot
release the region properly. We must know whether it is
an I/O or MEM resource.
Spotted by Eric Brower.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We were not being careful enough. When we trim the physical
memory areas, we have to make sure we don't remove the kernel
image or initial ramdisk image ranges.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add sg->offset to sg->dvma_address in pci_map_sg() on sparc32. Without the
offset, transfers to buffers that do not begin on a page boundary will not
work as expected.
Signed-off-by: Jan Andersson <jan.andersson@ieee.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Fix apollon board compiler error
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Fix GPMC compiler errors on OMAP2
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
The apple fn keys don't work anymore with 2.6.20-rc1.
The reason is that USB_HID_POWERBOOK appears in several files although
USB_HIDINPUT_POWERBOOK is the thing to be used.
The patch fixes this.
Cc: Greg KH <greg@kroah.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Due to the changes to make the kernel relocateable a new file is created
during the build process.
[jirislaby@gmail.com: The .gitigonre was intended to be in arch/ subtree]
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
If PG_dcache_dirty is set for a page, we need to flush the source page
before performing any copypage operation using a different virtual address.
This fixes the copypage implementations for XScale, StrongARM and ARMv6.
This patch fixes segmentation faults seen in the dynamic linker under
the usage patterns in glibc 2.4/2.5.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Since iop13xx defines the PCI I/O spaces with physical resource addresses
the __io macro needs to perform the physical to virtual conversion. I
incorrectly assumed that this would be handled by ioremap, but drivers
(like e1000) directly dereference the address returned from __io.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
The commit 505788cccbb96cd496b646594c8a5fcdc26bc2d9 in linus kernel tree
introduced some printks (for debugging ?) which are flooding the logs on
my h1940. This patch replace them with pr_debug calls.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
... and fix a comment as well.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
ACPI PM2 register was fallback for "Longhaul ver. 1" CPU's.
My assumption that this register isn't present at
"PowerSaver" motherboards is so far true, but current code
will not work correctly in other case. There are three possible
supports: ACPI C3, PM2 and northbridge. That was my assumption
that ACPI C3 and northbridge is for PS and northbridge and PM2
is for V1. In current code we can only check if it is ACPI
support or not by port22_en. So remove port22_en and add
longhaul_flags. If USE_ACPI_C3 and USE_NORTHBRIDGE are both
clear then it means ACPI PM2 support. Also change order of
support probe from ACPI C3, PM2, northbridge to ACPI C3,
northbridge, ACPI PM2. Paranoid protection against port 0x22
cast as ACPI PM2 register. Bit 1 clear in such case - lockup
on AGP DMA. And obvious (now) fixup for do_powersaver. Use
cx->address only for ACPI C3 ("PowerSaver" processor using
PM2 support).
Signed-off-by: Rafa¿ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
Signed-off-by: Rafał Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
We use the fixmap for accessing pci config space in pci_mmcfg_read/write().
The problem is in pci_exp_set_dev_base(). It is caching a last
accessed address to avoid calling set_fixmap_nocache() whenever
pci_mmcfg_read/write() is used.
static inline void pci_exp_set_dev_base(int bus, int devfn)
{
u32 dev_base = base | (bus << 20) | (devfn << 12);
if (dev_base != mmcfg_last_accessed_device) {
mmcfg_last_accessed_device = dev_base;
set_fixmap_nocache(FIX_PCIE_MCFG, dev_base);
}
}
cpu0 cpu1
---------------------------------------------------------------------------
pci_mmcfg_read("device-A")
pci_exp_set_dev_base()
set_fixmap_nocache()
pci_mmcfg_read("device-B")
pci_exp_set_dev_base()
set_fixmap_nocache()
pci_mmcfg_read("device-B")
pci_exp_set_dev_base()
/* doesn't flush tlb */
But if cpus accessed the above order, the second pci_mmcfg_read() on
cpu0 doesn't flush the TLB, because "mmcfg_last_accessed_device" is
device-B. So, second pci_mmcfg_read() on cpu0 accesses a device-A via
a previous TLB cache. This problem became the cause of several strange
behavior.
This patches fixes this situation by adds "mmcfg_last_accessed_cpu" check.
[ Alternatively, we could make a per-cpu mapping area or something. Not
that it's probably worth it, but if we wanted to avoid all locking and
instead just disable preemption, that would be the way to go. --Linus ]
Signed-off-by: OGAWA Hirofumi <hogawa@miraclelinux.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
A space and a bracket are missing (and indentation is wrong).
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
change notification
Fixes the oops in cpufreq_stats with acpi_cpufreq driver. The issue was
that the frequency was reported as 0 in acpi-cpufreq.c. The bug is due to
different indicies for freq_table and ACPI perf table.
Also adds a check in cpufreq_stats to check for error return from
freq_table_get_index() and avoid using the error return value.
Patch fixes the issue reported at
http://www.ussg.iu.edu/hypermail/linux/kernel/0611.2/0629.html
and also other similar issue here
http://bugme.osdl.org/show_bug.cgi?id=7383 comment 53
Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
Make x86_64 ACPI_CPU_FREQ select CPU_FREQ_TABLE like other methods do.
(although we should still eliminate as much use of 'select' as possible)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits)
ACPI: replace kmalloc+memset with kzalloc
ACPI: Add support for acpi_load_table/acpi_unload_table_id
fbdev: update after backlight argument change
ACPI: video: Add dev argument for backlight_device_register
ACPI: Implement acpi_video_get_next_level()
ACPI: Kconfig - depend on PM rather than selecting it
ACPI: fix NULL check in drivers/acpi/osl.c
ACPI: make drivers/acpi/ec.c:ec_ecdt static
ACPI: prevent processor module from loading on failures
ACPI: fix single linked list manipulation
ACPI: ibm_acpi: allow clean removal
ACPI: fix git automerge failure
ACPI: ibm_acpi: respond to workqueue update
ACPI: dock: add uevent to indicate change in device status
ACPI: ec: Lindent once again
ACPI: ec: Change #define to enums there possible.
ACPI: ec: Style changes.
ACPI: ec: Acquire Global Lock under EC mutex.
ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead.
ACPI: ec: Rename gpe_bit to gpe
...
|
|
APM idle code
Fernando Lopez-Lezcano reported frequent scheduling latencies and audio
xruns starting at the 2.6.18-rt kernel, and those problems persisted all
until current -rt kernels. The latencies were serious and unjustified by
system load, often in the milliseconds range.
After a patient and heroic multi-month effort of Fernando, where he
tested dozens of kernels, tried various configs, boot options,
test-patches of mine and provided latency traces of those incidents, the
following 'smoking gun' trace was captured by him:
_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| /
||||| delay
cmd pid ||||| time | caller
\ / ||||| \ | /
IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (try_to_wake_up)
IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup <<...>-5856> (37 0)
IRQ_19-1479 1D..1 0us : __trace_start_sched_wakeup (c01262ba 0 0)
IRQ_19-1479 1D..1 0us : resched_task (try_to_wake_up)
IRQ_19-1479 1D..1 0us : __spin_unlock_irqrestore (try_to_wake_up)
...
<idle>-0 1...1 11us!: default_idle (cpu_idle)
...
<idle>-0 0Dn.1 602us : smp_apic_timer_interrupt (c0103baf 1 0)
...
<...>-5856 0D..2 618us : __switch_to (__schedule)
<...>-5856 0D..2 618us : __schedule <<idle>-0> (20 162)
<...>-5856 0D..2 619us : __spin_unlock_irq (__schedule)
<...>-5856 0...1 619us : trace_stop_sched_switched (__schedule)
<...>-5856 0D..1 619us : trace_stop_sched_switched <<...>-5856> (37 0)
what is visible in this trace is that CPU#1 ran try_to_wake_up() for
PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task()
for CPU#0. But it decided to not send an IPI that no CPU - due to
TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set,
and only rescheduled to PID:5856 upon the next lapic timer IRQ. The
result was a 600+ usecs latency and a missed wakeup!
the bug turned out to be an idle-wakeup bug introduced into the mainline
kernel this summer via an optimization in the x86_64 tree:
commit 495ab9c045e1b0e5c82951b762257fe1c9d81564
Author: Andi Kleen <ak@suse.de>
Date: Mon Jun 26 13:59:11 2006 +0200
[PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
During some profiling I noticed that default_idle causes a lot of
memory traffic. I think that is caused by the atomic operations
to clear/set the polling flag in thread_info. There is actually
no reason to make this atomic - only the idle thread does it
to itself, other CPUs only read it. So I moved it into ti->status.
the problem is this type of change:
if (!hlt_counter && boot_cpu_data.hlt_works_ok) {
- clear_thread_flag(TIF_POLLING_NRFLAG);
+ current_thread_info()->status &= ~TS_POLLING;
smp_mb__after_clear_bit();
while (!need_resched()) {
local_irq_disable();
this changes clear_thread_flag() to an explicit clearing of TS_POLLING.
clear_thread_flag() is defined as:
clear_bit(flag, &ti->flags);
and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms:
static inline void clear_bit(int nr, volatile unsigned long * addr)
{
__asm__ __volatile__( LOCK_PREFIX
"btrl %1,%0"
hence smp_mb__after_clear_bit() is defined as a simple compile barrier:
#define smp_mb__after_clear_bit() barrier()
but the explicit TS_POLLING clearing introduced by the patch:
+ current_thread_info()->status &= ~TS_POLLING;
is not an atomic op! So the clearing of the TS_POLLING bit is freely
reorderable with the reading of the NEED_RESCHED bit - and both now
reside in different memory addresses.
CPU idle wakeup very much depends on ordered memory ops, the clearing of
the TS_POLLING flag must always be done before we test need_resched()
and hit the idle instruction(s). [Symmetrically, the wakeup code needs
to set NEED_RESCHED before it tests the TS_POLLING flag, so memory
ordering is paramount.]
Fernando's dual-core Athlon64 system has a sufficiently advanced memory
ordering model so that it triggered this scenario very often.
( And it also turned out that the reason why these latencies never
triggered on my testsystems is that i routinely use idle=poll, which
was the only idle variant not affected by this bug. )
The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to
act as an absolute barrier between the TS_POLLING write and the
NEED_RESCHED read. This affects almost all idling methods (default,
ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|