Age | Commit message | Author
2009-06-03 | x86, mce: check early in exception handler if panic is needed | Andi Kleen
The exception handler should behave differently if the exception is fatal versus one that can be returned from. In the first case it should never clear any registers because these need to be preserved for logging after the next boot. In the second case it should clear them on each CPU step by step so that other CPUs sharing the same bank don't see duplicate events. Otherwise we risk reporting events multiple times on any CPUs which have shared machine check banks, which is a common problem on Intel Nehalem, which has both SMT (two CPU threads sharing banks) and shared machine check banks in the uncore. Determine early in a special pass if any event requires a panic. This uses the mce_severity() function added earlier. This is needed for the next patch. Together with an earlier patch, this also fixes a problem where corrected events weren't logged on a fatal MCE. [ Impact: Feature ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-03 | x86, mce: add table driven machine check grading | Andi Kleen
The machine check grading (as in deciding what should be done for a given register value) has to be done multiple times soon and it's also getting more complicated. So it makes sense to consolidate it into a single function. To get smaller, more straightforward and possibly more extensible code I opted for a new table driven method. The various rules are put into a table which is then executed by a very simple interpreter. The grading engine is in a new file mce-severity.c. I also added a private include file mce-internal.h, because mce.h is already a bit too cluttered. This is dead code right now, but will be used in follow-on patches. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
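The table driven grading described in this commit message can be pictured with a small standalone sketch. This is not the code from mce-severity.c: the severity names, the rule layout, the `grade()` helper and the status bit positions are all assumptions chosen for illustration. Each rule matches when the masked status equals an expected value, and the first matching rule wins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical severity levels; names are illustrative only. */
enum severity { SEV_KEEP, SEV_UNCORRECTED, SEV_PANIC };

/* Illustrative status bits (assumed positions, not taken from the patch). */
#define ST_VALID (1ULL << 63)
#define ST_UC    (1ULL << 61)
#define ST_PCC   (1ULL << 57)

/* One rule: it matches when (status & mask) == result. */
struct rule {
    uint64_t mask;
    uint64_t result;
    enum severity sev;
    const char *msg;
};

/* Rules are tried top to bottom; the last entry is a catch-all. */
static const struct rule rules[] = {
    { ST_VALID,       0,              SEV_KEEP,        "invalid" },
    { ST_UC | ST_PCC, ST_UC | ST_PCC, SEV_PANIC,       "processor context corrupt" },
    { ST_UC,          ST_UC,          SEV_UNCORRECTED, "uncorrected" },
    { 0,              0,              SEV_KEEP,        "corrected or unknown" },
};

/* The "interpreter": walk the table and return the first matching severity. */
static enum severity grade(uint64_t status, const char **msg)
{
    size_t n = sizeof(rules) / sizeof(rules[0]);

    /* The table must end with a rule that matches everything. */
    assert(n > 0 && rules[n - 1].mask == 0 && rules[n - 1].result == 0);

    for (size_t i = 0; i < n; i++) {
        if ((status & rules[i].mask) == rules[i].result) {
            if (msg)
                *msg = rules[i].msg;
            return rules[i].sev;
        }
    }
    return SEV_KEEP; /* not reached because of the catch-all rule */
}
```

Adding a new grading case in such a scheme means adding one table row rather than another branch, which is the extensibility argument the message makes.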
2009-06-03 | x86, mce: remove TSC print heuristic | Andi Kleen
Previously mce_panic used a simple heuristic to avoid printing old so far unreported machine check events on a mce panic. This worked by comparing the TSC value at the start of the machine check handler with the event time stamp and only printing newer ones. This has a couple of issues, in particular on systems where the TSC is not fully synchronized between CPUs it could lose events or print old ones. It is also problematic with full system synchronization as it is added by the next patch. Remove the TSC heuristic and instead replace it with a simple heuristic to print corrected errors first and after that uncorrected errors and finally the worst machine check as determined by the machine check handler. This simplifies the code because there is no need to pass the original TSC value around. Contains fixes from Ying Huang [ Impact: bug fix, cleanup ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Ying Huang <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
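The replacement print order (corrected errors first, then uncorrected errors, then the worst event) can be sketched as below. This is only an illustration under assumptions: the `record` type, the `ST_UC` flag position and the `panic_print_order()` helper are hypothetical, and skipping the worst event in the first two passes so it appears only once is a choice made for this sketch, not something the message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ST_UC (1ULL << 61) /* assumed "uncorrected" flag */

/* Hypothetical logged event: a status word plus an id used for printing. */
struct record {
    uint64_t status;
    int id;
};

/*
 * Fill out[] with record ids in print order and return how many were
 * written. worst is the index of the worst event, or any value >= n
 * if there is none. out[] must have room for n entries.
 */
static size_t panic_print_order(const struct record *recs, size_t n,
                                size_t worst, int *out)
{
    size_t k = 0;

    assert(recs != NULL && out != NULL);

    /* Pass 1: corrected errors. */
    for (size_t i = 0; i < n; i++)
        if (i != worst && !(recs[i].status & ST_UC))
            out[k++] = recs[i].id;

    /* Pass 2: uncorrected errors. */
    for (size_t i = 0; i < n; i++)
        if (i != worst && (recs[i].status & ST_UC))
            out[k++] = recs[i].id;

    /* Finally the worst event, so it is the last thing printed. */
    if (worst < n)
        out[k++] = recs[worst].id;

    return k;
}
```

No timestamp is compared anywhere, so nothing depends on the TSC being synchronized between CPUs.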
2009-06-03 | x86, mce: log corrected errors when panicing | Andi Kleen
Normally the machine check handler ignores corrected errors and leaves them to machine_check_poll(). But when panicking, machine_check_poll() won't run, so log all errors. Note: this can still miss some cases until the "early no way out" patch later is applied too. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-03 | x86, mce: extend struct mce user interface with more information. | Andi Kleen
Experience has shown that struct mce, which is used to pass a machine check to the user space daemon, currently has a few limitations. Some data which is useful to print at panic level is also missing. This patch addresses most of them. The same information is also printed out together with mce panic. struct mce can be painlessly extended in a compatible way; the mcelog user space code just ignores additional fields with a warning.
- It doesn't provide a wall time timestamp. There have been a few complaints about that. Fix that by adding a 64bit time_t.
- It doesn't provide the exact CPU identification. This makes it awkward for mcelog to decode the event correctly, especially when there are variations in the supported MCE codes on different CPU models or when mcelog is running on a different host after a panic. Previously the administrator had to specify the correct CPU when mcelog ran on a different host, but with more variation in machine checks now it's better to auto detect that. It's also useful for more detailed analysis of CPU events. Pass CPUID 1.EAX and the cpu vendor (as encoded in processor.h) instead.
- Socket ID and initial APIC ID are useful to report because they make it possible to identify the failing CPU in some (not all) cases. This is also especially useful for the panic situation. This addresses one of the complaints from Thomas Gleixner earlier.
- The MCG capabilities MSR needs to be reported for some advanced error processing in mcelog.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-03 | x86, mce: support more than 256 CPUs in struct mce | Andi Kleen
The old struct mce had a limitation to 256 CPUs. But x86 Linux supports more than that now with x2apic. Add a new field extcpu to report the extended number. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-03 | x86, mce: store record length into memory struct mce anchor | Andi Kleen
This makes it easier for tools that want to extract the mcelog out of crash images or memory dumps to adapt to changing struct mce size. The length field replaces padding, so it's fully compatible. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
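To illustrate why a stored record length helps dump tools, here is a minimal sketch of a reader that steps through records using the length found in the anchor instead of its own compiled-in struct size. The `log_anchor` and `known_record` layouts and the `read_record()` helper are hypothetical simplifications, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical anchor that precedes the record array in memory. */
struct log_anchor {
    char     signature[8];
    uint32_t len;        /* number of record slots */
    uint32_t next;       /* number of slots in use */
    uint32_t flags;
    uint32_t recordlen;  /* bytes per record, as written by the producer */
};

/* The fields this (older) tool knows about; newer records may be longer. */
struct known_record {
    uint64_t status;
    uint64_t addr;
};

/*
 * Copy record idx into *out. The stride comes from the anchor, so the
 * tool still finds record boundaries when the producer's record grew.
 * Returns 0 on success, -1 if idx is out of range or the anchor is unusable.
 */
static int read_record(const uint8_t *entries, const struct log_anchor *a,
                       uint32_t idx, struct known_record *out)
{
    assert(entries != NULL && a != NULL && out != NULL);

    if (a->recordlen == 0 || idx >= a->next)
        return -1;

    /* Copy only what both sides have; zero-fill anything missing. */
    size_t copy = a->recordlen < sizeof(*out) ? a->recordlen : sizeof(*out);
    memset(out, 0, sizeof(*out));
    memcpy(out, entries + (size_t)idx * a->recordlen, copy);
    return 0;
}
```

Fields appended to the record later are simply skipped by an older tool, which matches the compatibility claim in the message.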
2009-06-03 | x86, mce: add MCE poll count to /proc/interrupts | Andi Kleen
Keep a count of the machine check polls (or CMCI events) in /proc/interrupts. Andi needs this for debugging, but it's also useful in general to see what's going on in the kernel. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-03 | x86, mce: add machine check exception count in /proc/interrupts | Andi Kleen
Useful for debugging, but it's also good general policy to have a counter for all special interrupts there. This makes it easier to diagnose where a CPU is spending its time. [ Impact: feature, debugging tool ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-01 | Merge branch 'irq/numa' into x86/mce3 | H. Peter Anvin
Merge reason: arch/x86/kernel/irqinit_{32,64}.c unified in irq/numa and modified in x86/mce3; this merge resolves the conflict. Conflicts: arch/x86/kernel/irqinit.c Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-06-01 | Merge branch 'x86/cpufeature' into irq/numa | Ingo Molnar
Merge reason: irq/numa didn't build because this commit:
  2759c32: x86: don't call read_apic_id if !cpu_has_apic
had a dependency on x86/cpufeature changes. Pull in that (small) branch to fix the dependency. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-01 | Merge branch 'linus' into irq/numa | Ingo Molnar
Conflicts: arch/mips/sibyte/bcm1480/irq.c arch/mips/sibyte/sb1250/irq.c Merge reason: we gathered a few conflicts plus update to latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-01 | Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging | Linus Torvalds
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: Update documentation on fan_max
  hwmon: (lm78) Add missing __devexit_p()
2009-06-01 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Fix section attribute warnings.
  sparc64: Fix SET_PERSONALITY to not clip bits outside of PER_MASK.
2009-06-01 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  3c509: Add missing EISA IDs
  MAINTAINERS: take maintainership of the cpmac Ethernet driver
  net/firmare: Ignore .cis files
  ath1e: add new device id for asus hardware
  mlx4_en: Fix a kernel panic when waking tx queue
  rtl8187: add USB ID for Linksys WUSB54GC-EU v2 USB wifi dongle
  at76c50x-usb: avoid mutex deadlock in at76_dwork_hw_scan
  mac8390: fix build with NET_POLL_CONTROLLER
  cxgb3: link fault fixes
  cxgb3: fix dma mapping regression
  netfilter: nfnetlink_log: fix wrong skbuff size calculation
  netfilter: xt_hashlimit does a wrong SEQ_SKIP
  bfin_mac: fix build error due to net_device_ops convert
  atlx: move modinfo data from atlx.h to atl1.c
  gianfar: fix babbling rx error event bug
  cls_cgroup: read classid atomically in classifier
  netfilter: nf_ct_dccp: add missing DCCP protocol changes in event cache
  netfilter: nf_ct_tcp: fix accepting invalid RST segments
2009-06-01 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jaswinder/headers-check-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jaswinder/headers-check-2.6:
  headers_check fix: linux/net_dropmon.h
  headers_check fix: linux/auto_fs.h
2009-06-01 | hwmon: Update documentation on fan_max | Christian Engelmayer
Add fan_max description. Add fan limit alarm 'max_alarm' to the alarm section. Signed-off-by: Christian Engelmayer <christian.engelmayer@frequentis.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-06-01 | hwmon: (lm78) Add missing __devexit_p() | Mike Frysinger
The remove function uses __devexit, so the .remove assignment needs __devexit_p() to fix a build error with hotplug disabled. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-06-01 | 3c509: Add missing EISA IDs | Maciej W. Rozycki
Several EISA device IDs for 3c509 family network cards are missing from the driver, making the cards unusable in their EISA mode. Here's a fix to add them based on the EISA configuration files distributed by 3Com and our eisa.ids database. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-01 | MAINTAINERS: take maintainership of the cpmac Ethernet driver | Florian Fainelli
This patch adds me as the maintainer of the CPMAC (AR7) Ethernet driver. Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-01 | headers_check fix: linux/net_dropmon.h | Jaswinder Singh Rajput
fix the following 'make headers_check' warnings: usr/include/linux/net_dropmon.h:7: found __[us]{8,16,32,64} type without #include <linux/types.h> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-06-01 | headers_check fix: linux/auto_fs.h | Jaswinder Singh Rajput
fix the following 'make headers_check' warnings: usr/include/linux/auto_fs.h:17: include of <linux/types.h> is preferred over <asm/types.h> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-05-30 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide_pci_generic: add quirk for Netcell ATA RAID
2009-05-30 | ide_pci_generic: add quirk for Netcell ATA RAID | Bartlomiej Zolnierkiewicz
We need to explicitly mark words 85-87 as valid ones since firmware doesn't do it. This should fix support for LBA48 and FLUSH CACHE [EXT] command which stopped working after we applied more strict checking of identify words in: commit 942dcd85bf8edf38cdc3745306ca250684d99a61 ("ide: idedisk_supports_lba48() -> ata_id_lba48_enabled()") and commit 4b58f17d7c45a8e5f4acda641bec388398b9c0fa ("ide: ide_id_has_flush_cache() -> ata_id_flush_enabled()") Reported-and-tested-by: "Trevor Hemsley" <trevor.hemsley@ntlworld.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-05-30 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix bh leak in nilfs_cpfile_delete_checkpoints function
2009-05-30 | Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 | Linus Torvalds
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI, i915: build fix (v2)
  acpi-cpufreq: fix printk typo and indentation
  ACPI processor: remove spurious newline from warning message
  drm/i915: acpi/video.c fix section mismatch warning
  ACPI: video: DMI workaround broken Acer 5315 BIOS enabling display brightness
  ACPI: video: DMI workaround broken eMachines E510 BIOS enabling display brightness
  ACPI: sanity check _PSS frequency to prevent cpufreq crash
  i7300_idle: allow testing on i5000-series hardware w/o re-compile
  PCI/ACPI: fix wrong ref count handling in acpi_pci_bind()
  cpuidle: fix AMD C1E suspend hang
  cpuidle: makes AMD C1E work in acpi_idle
2009-05-30 | Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx | Linus Torvalds
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  fsldma: Fix compile warnings
  fsldma: fix memory leak on error path in fsl_dma_prep_memcpy()
  fsldma: snooping is not enabled for last entry in descriptor chain
  fsldma: fix infinite loop on multi-descriptor DMA chain completion
  fsldma: fix "DMA halt timeout!" errors
  fsldma: fix check on potential fdev->chan[] overflow
  fsldma: update mailling list address in MAINTAINERS
2009-05-30 | nilfs2: fix bh leak in nilfs_cpfile_delete_checkpoints function | Ryusuke Konishi
The nilfs_cpfile_delete_checkpoints() wrongly skips brelse() for the header block of checkpoint file in case of errors. This fixes the leak bug. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-05-29 | net/firmare: Ignore .cis files | Matt Kraai
Signed-off-by: Matt Kraai <kraai@ftbfs.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 | ath1e: add new device id for asus hardware | Greg Kroah-Hartman
Gary Lin reports that a new device id needs to be added to the atl1e in order to get some new Asus hardware to work properly. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 | mlx4_en: Fix a kernel panic when waking tx queue | Yevgeny Petrilin
When the transmit queue gets full we enable interrupts for TX completions. There was a race in which we handled the TX queue both from the interrupt context and from the transmit function. Using "spin_trylock_irq()" ensures this doesn't happen. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 | David S. Miller
2009-05-29 | Merge branches 'bugzilla-13121+', 'bugzilla-13233', 'redhat-bugzilla-500311', 'pci-bind-oops', 'misc-2.6.30' and 'i7300_idle' into release | Len Brown
2009-05-29 | ACPI, i915: build fix (v2) | Len Brown
drivers/built-in.o: In function `intel_opregion_init':
(.text+0x9d540): undefined reference to `acpi_video_register'
v2: move under DRM_I915 from DRM_I915_KMS
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
2009-05-29 | acpi-cpufreq: fix printk typo and indentation | Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
2009-05-29 | ACPI processor: remove spurious newline from warning message | Frans Pop
Commit 4973b22a ("ACPI processor: reset the throttling state once it's invalid") introduced a new warning which prints a spurious newline. The ACPI_WARNING macro that is used already takes care of adding a newline, after adding ACPI_CA_VERSION to the message. Remove the newline to avoid the message getting split into two lines. Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: Len Brown <len.brown@intel.com>
2009-05-29 | drm/i915: acpi/video.c fix section mismatch warning | Jaswinder Singh Rajput
Currently acpi_video_exit() is exported as well as using __exit which causes:
  WARNING: drivers/acpi/video.o(__ksymtab+0x0): Section mismatch in reference from the variable __ksymtab_acpi_video_exit to the function .exit.text:acpi_video_exit()
  The symbol acpi_video_exit is exported and annotated __exit
  Fix this by removing the __exit annotation of acpi_video_exit or drop the export.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-05-29 | ACPI: video: DMI workaround broken Acer 5315 BIOS enabling display brightness | Zhang Rui
http://bugzilla.kernel.org/show_bug.cgi?id=13121 Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-05-29 | ACPI: video: DMI workaround broken eMachines E510 BIOS enabling display brightness | Zhang Rui
http://bugzilla.kernel.org/show_bug.cgi?id=13376 Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-05-29 | ACPI: sanity check _PSS frequency to prevent cpufreq crash | Len Brown
When BIOS SETUP is changed to disable EIST, some BIOS hand the OS an un-initialized _PSS:
    Name (_PSS, Package (0x06)
    {
        Package (0x06)
        {
            0x80000000, // frequency [MHz]
            0x80000000, // power [mW]
            0x80000000, // latency [us]
            0x80000000, // BM latency [us]
            0x80000000, // control
            0x80000000  // status
        },
        ...
These are outrageous values for frequency, power and latency, raising the question where to draw the line between legal and illegal. We tend to survive garbage in the power and latency fields, but we can BUG_ON when garbage is in the frequency field. Cpufreq multiplies the frequency by 1000 and stores it in a u32 KHz. So disregard a _PSS with a frequency so large that it can't be represented by cpufreq. https://bugzilla.redhat.com/show_bug.cgi?id=500311 Signed-off-by: Len Brown <len.brown@intel.com>
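The representability rule described here (frequency in MHz times 1000 must fit in a 32-bit kHz value) can be written as a small standalone check. This is a sketch of the arithmetic only, not the kernel's code; the helper names `pss_frequency_representable()` and `khz_from_mhz()` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when freq_mhz * 1000 fits in a 32-bit kHz value. Comparing
 * against UINT32_MAX / 1000 avoids doing the multiplication that
 * could overflow.
 */
static bool pss_frequency_representable(uint64_t freq_mhz)
{
    return freq_mhz <= UINT32_MAX / 1000u;
}

/* Convert to kHz; callers are expected to have validated the input. */
static uint32_t khz_from_mhz(uint64_t freq_mhz)
{
    assert(pss_frequency_representable(freq_mhz));
    return (uint32_t)(freq_mhz * 1000u);
}
```

With this rule the garbage value 0x80000000 from the _PSS above is rejected, while an ordinary value such as 2400 MHz converts to 2400000 kHz.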
2009-05-29 | sparc64: Fix section attribute warnings. | David S. Miller
CSUM copy to/from user assembler was missing allocatable and executable attributes for .fixup Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 | Merge master.kernel.org:/home/rmk/linux-2.6-arm | Linus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] update mach-types
  [ARM] Add cmpxchg support for ARMv6+ systems (v5)
  [ARM] barriers: improve xchg, bitops and atomic SMP barriers
  Gemini: Fix SRAM/ROM location after memory swap
  MAINTAINER: Add F: entries for Gemini and FA526
  [ARM] disable NX support for OABI-supporting kernels
  [ARM] add coherent DMA mask for mv643xx_eth
  [ARM] pxa/palm: fix PalmLD/T5/TX AC97 MFP
  [ARM] pxa: add parameter to clksrc_read() for pxa168/910
  [ARM] pxa: fix the incorrectly defined drive strength macros for pxa{168,910}
  [ARM] Orion: Remove explicit name for platform device resources
  [ARM] Kirkwood: Correct MPP for SATA activity/presence LEDs of QNAP TS-119/TS-219.
  [ARM] pxa/ezx: fix pin configuration for low power mode
  [ARM] pxa/spitz: provide spitz_ohci_exit() that unregisters USB_HOST GPIO
  [ARM] pxa: enable GPIO receivers after configuring pins
  [ARM] pxa: allow gpio_reset drive high during normal work
  [ARM] pxa: save/restore PGSR on suspend/resume.
2009-05-29 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI Hotplug: acpiphp: don't store a pci_dev in acpiphp_func
2009-05-29 | Merge git://git.infradead.org/~dwmw2/mtd-2.6.30 | Linus Torvalds
* git://git.infradead.org/~dwmw2/mtd-2.6.30:
  jffs2: Fix corruption when flash erase/write failure
  mtd: MXC NAND driver fixes (v5)
2009-05-29 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  Revert "USB: Correct Makefile to make isp1760 buildable"
  usb-serial: fix crash when sub-driver updates firmware
  USB: isp1760: urb_dequeue doesn't always find the urbs
  USB: Yet another Conexant Clone to add to cdc-acm.c
  USB: atmel_usb_udc: Use kzalloc() to allocate ep structures
  USB: atmel-usba-udc : fix control out requests.
2009-05-29 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  Driver Core: do not oops when driver_unregister() is called for unregistered drivers
  sysfs: file.c: use create_singlethread_workqueue()
2009-05-29 | Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux | Linus Torvalds
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux:
  svcrdma: dma unmap the correct length for the RPCRDMA header page.
  nfsd: Revert "svcrpc: take advantage of tcp autotuning"
  nfsd: fix hung up of nfs client while sync write data to nfs server
2009-05-29 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: libps2 - better handle bad scheduler decisions
  Input: usb1400_ts - fix access to "device data" in resume function
  Input: multitouch - augment event semantics documentation
  Input: multitouch - add tracking ID to the protocol
2009-05-29 | Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel | Linus Torvalds
* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
  i915: Set object to gtt domain when faulting it back in
  drm/i915: Apply a big hammer to 865 GEM object CPU cache flushing.
  drm/i915: Fix tiling pitch handling on 8xx.
2009-05-29 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Compaq Presario CQ60 patching for Conexant
  sound: usb-audio: make the MotU Fastlane work again
  ALSA: Enable PCM hw_ptr_jiffies check only in xrun_debug mode
  ALSA: Fix invalid jiffies check after pause