aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2009-04-06perf_counter: add more context informationPeter Zijlstra
Put in counts to tell which ips belong to what context. ----- | | hv | -- nr | | kernel | -- | | user ----- Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Orig-LKML-Reference: <20090402091319.493101305@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: pmc arbitrationPeter Zijlstra
Follow the example set by powerpc and try to play nice with oprofile and the nmi watchdog. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090330171024.459968444@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: x86: callchain supportPeter Zijlstra
Provide the x86 perf_callchain() implementation. Code based on the ftrace/sysprof code from Soeren Sandmann Pedersen. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Soeren Sandmann Pedersen <sandmann@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Orig-LKML-Reference: <20090330171024.341993293@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: x86: proper error propagation for the x86 hw_perf_counter_init()Peter Zijlstra
Now that Paul cleaned up the error propagation paths, pass down the x86 error as well. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090330171023.792822360@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: make it possible for hw_perf_counter_init to return error codesPaul Mackerras
Impact: better error reporting At present, if hw_perf_counter_init encounters an error, all it can do is return NULL, which causes sys_perf_counter_open to return an EINVAL error to userspace. This isn't very informative for userspace; it means that userspace can't tell the difference between "sorry, oprofile is already using the PMU" and "we don't support this CPU" and "this CPU doesn't support the requested generic hardware event". This commit uses the PTR_ERR/ERR_PTR/IS_ERR set of macros to let hw_perf_counter_init return an error code on error rather than just NULL if it wishes. If it does so, that error code will be returned from sys_perf_counter_open to userspace. If it returns NULL, an EINVAL error will be returned to userspace, as before. This also adapts the powerpc hw_perf_counter_init to make use of this to return ENXIO, EINVAL, EBUSY, or EOPNOTSUPP as appropriate. It would be good to add extra error numbers in future to allow userspace to distinguish the various errors that are currently reported as EINVAL, i.e. irq_period < 0, too many events in a group, conflict between exclude_* settings in a group, and PMU resource conflict in a group. [ v2: fix a bug pointed out by Corey Ashford where error returns from hw_perf_counter_init were not handled correctly in the case of raw hardware events.] Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Orig-LKML-Reference: <20090330171023.682428180@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: powerpc: only reserve PMU hardware when we need itPaul Mackerras
Impact: cooperate with oprofile At present, on PowerPC, if you have perf_counters compiled in, oprofile doesn't work. There is code to allow the PMU to be shared between competing subsystems, such as perf_counters and oprofile, but currently the perf_counter subsystem reserves the PMU for itself at boot time, and never releases it. This makes perf_counter play nicely with oprofile. Now we keep a count of how many perf_counter instances are counting hardware events, and reserve the PMU when that count becomes non-zero, and release the PMU when that count becomes zero. This means that it is possible to have perf_counters compiled in and still use oprofile, as long as there are no hardware perf_counters active. This also means that if oprofile is active, sys_perf_counter_open will fail if the hw_event specifies a hardware event. To avoid races with other tasks creating and destroying perf_counters, we use a mutex. We use atomic_inc_not_zero and atomic_add_unless to avoid having to take the mutex unless there is a possibility of the count going between 0 and 1. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Orig-LKML-Reference: <20090330171023.627912475@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: unify and fix delayed counter wakeupPeter Zijlstra
While going over the wakeup code I noticed delayed wakeups only work for hardware counters but basically all software counters rely on them. This patch unifies and generalizes the delayed wakeup to fix this issue. Since we're dealing with NMI context bits here, use a cmpxchg() based single link list implementation to track counters that have pending wakeups. [ This should really be generic code for delayed wakeups, but since we cannot use cmpxchg()/xchg() in generic code, I've let it live in the perf_counter code. -- Eric Dumazet could use it to aggregate the network wakeups. ] Furthermore, the x86 method of using TIF flags was flawed in that its quite possible to end up setting the bit on the idle task, loosing the wakeup. The powerpc method uses per-cpu storage and does appear to be sufficient. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090330171023.153932974@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: record time running and time enabled for each counterPaul Mackerras
Impact: new functionality Currently, if there are more counters enabled than can fit on the CPU, the kernel will multiplex the counters on to the hardware using round-robin scheduling. That isn't too bad for sampling counters, but for counting counters it means that the value read from a counter represents some unknown fraction of the true count of events that occurred while the counter was enabled. This remedies the situation by keeping track of how long each counter is enabled for, and how long it is actually on the cpu and counting events. These times are recorded in nanoseconds using the task clock for per-task counters and the cpu clock for per-cpu counters. These values can be supplied to userspace on a read from the counter. Userspace requests that they be supplied after the counter value by setting the PERF_FORMAT_TOTAL_TIME_ENABLED and/or PERF_FORMAT_TOTAL_TIME_RUNNING bits in the hw_event.read_format field when creating the counter. (There is no way to change the read format after the counter is created, though it would be possible to add some way to do that.) Using this information it is possible for userspace to scale the count it reads from the counter to get an estimate of the true count: true_count_estimate = count * total_time_enabled / total_time_running This also lets userspace detect the situation where the counter never got to go on the cpu: total_time_running == 0. This functionality has been requested by the PAPI developers, and will be generally needed for interpreting the count values from counting counters correctly. In the implementation, this keeps 5 time values (in nanoseconds) for each counter: total_time_enabled and total_time_running are used when the counter is in state OFF or ERROR and for reporting back to userspace. When the counter is in state INACTIVE or ACTIVE, it is the tstamp_enabled, tstamp_running and tstamp_stopped values that are relevant, and total_time_enabled and total_time_running are determined from them. (tstamp_stopped is only used in INACTIVE state.) The reason for doing it like this is that it means that only counters being enabled or disabled at sched-in and sched-out time need to be updated. There are no new loops that iterate over all counters to update total_time_enabled or total_time_running. This also keeps separate child_total_time_running and child_total_time_enabled fields that get added in when reporting the totals to userspace. They are separate fields so that they can be atomic. We don't want to use atomics for total_time_running, total_time_enabled etc., because then we would have to use atomic sequences to update them, which are slower than regular arithmetic and memory accesses. It is possible to measure total_time_running by adding a task_clock counter to each group of counters, and total_time_enabled can be measured approximately with a top-level task_clock counter (though inaccuracies will creep in if you need to disable and enable groups since it is not possible in general to disable/enable the top-level task_clock counter simultaneously with another group). However, that adds extra overhead - I measured around 15% increase in the context switch latency reported by lat_ctx (from lmbench) when a task_clock counter was added to each of 2 groups, and around 25% increase when a task_clock counter was added to each of 4 groups. (In both cases a top-level task-clock counter was also added.) In contrast, the code added in this commit gives better information with no overhead that I could measure (in fact in some cases I measured lower times with this code, but the differences were all less than one standard deviation). [ v2: address review comments by Andrew Morton. ] Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Orig-LKML-Reference: <18890.6578.728637.139402@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: new output ABI - part 1Peter Zijlstra
Impact: Rework the perfcounter output ABI use sys_read() only for instant data and provide mmap() output for all async overflow data. The first mmap() determines the size of the output buffer. The mmap() size must be a PAGE_SIZE multiple of 1+pages, where pages must be a power of 2 or 0. Further mmap()s of the same fd must have the same size. Once all maps are gone, you can again mmap() with a new size. In case of 0 extra pages there is no data output and the first page only contains meta data. When there are data pages, a poll() event will be generated for each full page of data. Furthermore, the output is circular. This means that although 1 page is a valid configuration, its useless, since we'll start overwriting it the instant we report a full page. Future work will focus on the output format (currently maintained) where we'll likey want each entry denoted by a header which includes a type and length. Further future work will allow to splice() the fd, also containing the async overflow data -- splice() would be mutually exclusive with mmap() of the data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090323172417.470536358@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: add an mmap method to allow userspace to read hardware countersPaul Mackerras
Impact: new feature giving performance improvement This adds the ability for userspace to do an mmap on a hardware counter fd and get access to a read-only page that contains the information needed to translate a hardware counter value to the full 64-bit counter value that would be returned by a read on the fd. This is useful on architectures that allow user programs to read the hardware counters, such as PowerPC. The mmap will only succeed if the counter is a hardware counter monitoring the current process. On my quad 2.5GHz PowerPC 970MP machine, userspace can read a counter and translate it to the full 64-bit value in about 30ns using the mmapped page, compared to about 830ns for the read syscall on the counter, so this does give a significant performance improvement. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Orig-LKML-Reference: <20090323172417.297057964@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: remove the event config bitfieldsPeter Zijlstra
Since the bitfields turned into a bit of a mess, remove them and rely on good old masks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090323172417.059499915@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: fix type/event_id layout on big-endian systemsPaul Mackerras
Impact: build fix for powerpc Commit db3a944aca35ae61 ("perf_counter: revamp syscall input ABI") expanded the hw_event.type field into a union of structs containing bitfields. In particular it introduced a type field and a raw_type field, with the intention that the 1-bit raw_type field should overlay the most-significant bit of the 8-bit type field, and in fact perf_counter_alloc() now assumes that (or at least, assumes that raw_type doesn't overlay any of the bits that are 1 in the values of PERF_TYPE_{HARDWARE,SOFTWARE,TRACEPOINT}). Unfortunately this is not true on big-endian systems such as PowerPC, where bitfields are laid out from left to right, i.e. from most significant bit to least significant. This means that setting hw_event.type = PERF_TYPE_SOFTWARE will set hw_event.raw_type to 1. This fixes it by making the layout depend on whether or not __BIG_ENDIAN_BITFIELD is defined. It's a bit ugly, but that's what we get for using bitfields in a user/kernel ABI. Also, that commit didn't fix up some places in arch/powerpc/kernel/ perf_counter.c where hw_event.raw and hw_event.event_id were used. This fixes them too. Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-06perf_counter: powerpc: clean up perc_counter_interruptPaul Mackerras
Impact: cleanup This updates the powerpc perf_counter_interrupt following on from the "perf_counter: unify irq output code" patch. Since we now use the generic perf_counter_output code, which sets the perf_counter_pending flag directly, we no longer need the need_wakeup variable. This removes need_wakeup and makes perf_counter_interrupt use get_perf_counter_pending() instead. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Orig-LKML-Reference: <20090319194234.024464535@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: unify irq output codePeter Zijlstra
Impact: cleanup Having 3 slightly different copies of the same code around does nobody any good. First step in revamping the output format. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Orig-LKML-Reference: <20090319194233.929962222@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: revamp syscall input ABIPeter Zijlstra
Impact: modify ABI The hardware/software classification in hw_event->type became a little strained due to the addition of tracepoint tracing. Instead split up the field and provide a type field to explicitly specify the counter type, while using the event_id field to specify which event to use. Raw counters still work as before, only the raw config now goes into raw_event. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Orig-LKML-Reference: <20090319194233.836807573@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: abstract wakeup flag setting in core to fix powerpc buildPaul Mackerras
Impact: build fix for powerpc Commit bd753921015e7905 ("perf_counter: software counter event infrastructure") introduced a use of TIF_PERF_COUNTERS into the core perfcounter code. This breaks the build on powerpc because we use a flag in a per-cpu area to signal wakeups on powerpc rather than a thread_info flag, because the thread_info flags have to be manipulated with atomic operations and are thus slower than per-cpu flags. This fixes the by changing the core to use an abstracted set_perf_counter_pending() function, which is defined on x86 to set the TIF_PERF_COUNTERS flag and on powerpc to set the per-cpu flag (paca->perf_counter_pending). It changes the previous powerpc definition of set_perf_counter_pending to not take an argument and adds a clear_perf_counter_pending, so as to simplify the definition on x86. On x86, set_perf_counter_pending() is defined as a macro. Defining it as a static inline in arch/x86/include/asm/perf_counters.h causes compile failures because <asm/perf_counters.h> gets included early in <linux/sched.h>, and the definitions of set_tsk_thread_flag etc. are therefore not available in <asm/perf_counters.h>. (On powerpc this problem is avoided by defining set_perf_counter_pending etc. in <asm/hw_irq.h>.) Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-06perf_counter: fix crash on perfmon v1 systemsIngo Molnar
Impact: fix boot crash on Intel Perfmon Version 1 systems Intel Perfmon v1 does not support the global MSRs, nor does it offer the generalized MSR ranges. So support v2 and later CPUs only. Also mark pmc_ops as read-mostly - to avoid false cacheline sharing. Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: provide major/minor page fault software eventsPeter Zijlstra
Provide separate sw counters for major and minor page faults. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: provide pagefault software eventsPeter Zijlstra
We use the generic software counter infrastructure to provide page fault events. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: x86: use ULL postfix for 64bit constantsPeter Zijlstra
Fix a build warning on 32bit machines by explicitly marking the constants as 64-bit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: add comment to barrierPeter Zijlstra
We need to ensure the enabled=0 write happens before we start disabling the actual counters, so that a pcm_amd_enable() will not enable one underneath us. I think the race is impossible anyway, we always balance the ops within any one context and perform enable() with IRQs disabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: x86: fix 32-bit irq_period assumptionPeter Zijlstra
No need to assume the irq_period is 32bit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06Merge branch 'linus' into perfcounters/core-v2Ingo Molnar
Merge reason: we have gathered quite a few conflicts, need to merge upstream Conflicts: arch/powerpc/kernel/Makefile arch/x86/ia32/ia32entry.S arch/x86/include/asm/hardirq.h arch/x86/include/asm/unistd_32.h arch/x86/include/asm/unistd_64.h arch/x86/kernel/cpu/common.c arch/x86/kernel/irq.c arch/x86/kernel/syscall_table_32.S arch/x86/mm/iomap_32.c include/linux/sched.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-05Merge branch 'for-next' of git://git.o-hand.com/linux-mfdLinus Torvalds
* 'for-next' of git://git.o-hand.com/linux-mfd: mfd: fix da903x warning mfd: fix MAINTAINERS entry mfd: Use the value of the final spin when reading the AUXADC mfd: Storage class should be before const qualifier mfd: PASIC3: supply clock_rate to DS1WM via driver_data mfd: remove DS1WM clock handling mfd: remove unused PASIC3 bus_shift field pxa/magician: remove deprecated .bus_shift from PASIC3 platform_data mfd: convert PASIC3 to use MFD core mfd: convert DS1WM to use MFD core mfd: Support active high IRQs on WM835x mfd: Use bulk read to fill WM8350 register cache mfd: remove duplicated #include from pcf50633
2009-04-05Merge branch 'for-linus' of git://repo.or.cz/cris-mirrorLinus Torvalds
* 'for-linus' of git://repo.or.cz/cris-mirror: CRISv32: Remove extraneous space between -I and the path. cris: convert obsolete hw_interrupt_type to struct irq_chip BUG to BUG_ON changes cpumask: use mm_cpumask() wrapper: cris cpumask: Use accessors code.: cris cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits.: cris
2009-04-05Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (140 commits) ACPI: processor: use .notify method instead of installing handler directly ACPI: button: use .notify method instead of installing handler directly ACPI: support acpi_device_ops .notify methods toshiba-acpi: remove MAINTAINERS entry ACPI: battery: asynchronous init acer-wmi: Update copyright notice & documentation acer-wmi: Cleanup the failure cleanup handling acer-wmi: Blacklist Acer Aspire One video: build fix thinkpad-acpi: rework brightness support thinkpad-acpi: enhanced debugging messages for the fan subdriver thinkpad-acpi: enhanced debugging messages for the hotkey subdriver thinkpad-acpi: enhanced debugging messages for rfkill subdrivers thinkpad-acpi: restrict access to some firmware LEDs thinkpad-acpi: remove HKEY disable functionality thinkpad-acpi: add new debug helpers and warn of deprecated atts thinkpad-acpi: add missing log levels thinkpad-acpi: cleanup debug helpers thinkpad-acpi: documentation cleanup thinkpad-acpi: drop ibm-acpi alias ...
2009-04-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (23 commits) sh: sh7785lcr: Map whole PCI address space. sh: Fix up DSP context save/restore. sh: Fix up number of on-chip DMA channels on SH7091. sh: update defconfigs. sh: Kill off broken direct-mapped cache mode. sh: Wire up ARCH_HAS_DEFAULT_IDLE for cpuidle. sh: Add a command line option for disabling I/O trapping. sh: Select ARCH_HIBERNATION_POSSIBLE. sh: migor: Fix up CEU use flags. input: migor_ts: add wakeup support rtc: rtc-sh: use set_irq_wake() input: sh_keysc: use enable/disable_irq_wake() sh: intc: set_irq_wake() support sh: intc: install enable, disable and shutdown callbacks clocksource: sh_cmt: use remove_irq() and remove clockevent workaround sh: ap325 and Migo-R use new sh_mobile_ceu_info flags sh: Fix up -Wformat-security whining. sh: ap325rxa: Add ov772x support, again. sh: Sanitize asm/mmu.h for assembly use. sh: Tidy up sh7786 pinmux table. ...
2009-04-05Merge branch 'avr32-arch' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6 * 'avr32-arch' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6: avr32: add hardware handshake support to atmel_serial avr32: add RTS/CTS/CLK pin selection for the USARTs Add RTC support for Merisc boards avr32: at32ap700x: setup DMA for AC97C in the machine code avr32: at32ap700x: setup DMA for ABDAC in the machine code Add Merisc board support avr32: use gpio_is_valid() to check USBA vbus_pin I/O line atmel-usba-udc: use gpio_is_valid() to check vbus_pin I/O line avr32: fix timing LCD parameters for EVKLCD10X boards avr32: use GPIO line PB15 on EVKLCD10x boards for backlight avr32: configure MCI detect and write protect pins for EVKLCD10x boards avr32: set pin mask to alternative 18 bpp for EVKLCD10x boards avr32: add pin mask for 18-bit color on the LCD controller avr32: fix 15-bit LCDC pin mask to use MSB lines
2009-04-05Merge branch 'tracing-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits) tracing, net: fix net tree and tracing tree merge interaction tracing, powerpc: fix powerpc tree and tracing tree interaction ring-buffer: do not remove reader page from list on ring buffer free function-graph: allow unregistering twice trace: make argument 'mem' of trace_seq_putmem() const tracing: add missing 'extern' keywords to trace_output.h tracing: provide trace_seq_reserve() blktrace: print out BLK_TN_MESSAGE properly blktrace: extract duplidate code blktrace: fix memory leak when freeing struct blk_io_trace blktrace: fix blk_probes_ref chaos blktrace: make classic output more classic blktrace: fix off-by-one bug blktrace: fix the original blktrace blktrace: fix a race when creating blk_tree_root in debugfs blktrace: fix timestamp in binary output tracing, Text Edit Lock: cleanup tracing: filter fix for TRACE_EVENT_FORMAT events ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release() x86: kretprobe-booster interrupt emulation code fix ... Fix up trivial conflicts in arch/parisc/include/asm/ftrace.h include/linux/memory.h kernel/extable.c kernel/module.c
2009-04-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumaskLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits) cpumask: remove cpumask allocation from idle_balance, fix numa, cpumask: move numa_node_id default implementation to topology.h, fix cpumask: remove cpumask allocation from idle_balance x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus x86: cpumask: update 32-bit APM not to mug current->cpus_allowed x86: microcode: cleanup x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash numa, cpumask: move numa_node_id default implementation to topology.h cpumask: convert node_to_cpumask_map[] to cpumask_var_t cpumask: remove x86 cpumask_t uses. cpumask: use cpumask_var_t in uv_flush_tlb_others. cpumask: remove cpumask_t assignment from vector_allocation_domain() cpumask: make Xen use the new operators. cpumask: clean up summit's send_IPI functions cpumask: use new cpumask functions throughout x86 x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t cpumask: convert node_to_cpumask_map[] to cpumask_var_t x86: unify 32 and 64-bit node_to_cpumask_map ...
2009-04-05Merge ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param: module: use strstarts() strstarts: helper function for !strncmp(str, prefix, strlen(prefix)) arm: allow usage of string functions in linux/string.h module: don't use stop_machine on module load module: create a request_module_nowait() module: include other structures in module version check module: remove the SHF_ALLOC flag on the __versions section. module: clarify the force-loading taint message. module: Export symbols needed for Ksplice Ksplice: Add functions for walking kallsyms symbols module: remove module_text_address() module: __module_address module: Make find_symbol return a struct kernel_symbol kernel/module.c: fix an unused goto label param: fix charp parameters set via sysfs Fix trivial conflicts in kernel/extable.c manually.
2009-04-05Merge branch 'linus' into releaseLen Brown
Conflicts: arch/x86/kernel/cpu/cpufreq/longhaul.c Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-05Merge branch 'misc' into releaseLen Brown
2009-04-05Merge branch 'x2apic' into releaseLen Brown
2009-04-05pxa/magician: remove deprecated .bus_shift from PASIC3 platform_dataPhilipp Zabel
The PASIC3 driver now calculates its register spacing from the resource size. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2009-04-04sh: sh7785lcr: Map whole PCI address space.Takashi Yoshii
PCI still doesn't work on sh7785lcr 29bit 256M map mode. On SH7785, PCI -> SHwy address translation is not base+offset but somewhat like base|offset (See HW Manual (rej09b0261) Fig. 13.11). So, you can't export CS2,3,4,5 by 256M at CS2 (results CS0,1,2,3 exported, I guess). There are two candidates. a) 128M@CS2 + 128M@CS4 b) 512M@CS0 Attached patch is B. It maps 512M Byte at 0 independently of memory size. It results CS0 to CS6 and perhaps some more being accessible from PCI. Tested on 7785lcr 29bit 128M map 7785lcr 29bit 256M map (NOT tested on 32bit) Signed-off-by: Takashi YOSHII <yoshii.takashi@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-04-04sh: Fix up DSP context save/restore.Michael Trimarchi
There were a number of issues with the DSP context save/restore code, mostly left-over relics from when it was introduced on SH3-DSP with little follow-up testing, resulting in things like task_pt_dspregs() referencing incorrect state on the stack. This follows the MIPS convention of tracking the DSP state in the thread_struct and handling the state save/restore in switch_to() and finish_arch_switch() respectively. The regset interface is also updated, which allows us to finally be rid of task_pt_dspregs() and the special cased task_pt_regs(). Signed-off-by: Michael Trimarchi <michael@evidence.eu.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-04-04sh: Fix up number of on-chip DMA channels on SH7091.Paul Mundt
This accidentally regressed when the multi-IRQ changes went in, switching SH7091 from 4 to 6 channels. Add SH7091 back in to the 4-channel dependency list. Reported-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-04-03Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mtrr: remove debug message x86: disable stack-protector for __restore_processor_state() x86: fix is_io_mapping_possible() build warning on i386 allnoconfig x86, setup: compile with -DDISABLE_BRANCH_PROFILING x86/dma: unify definition of pci_unmap_addr* and pci_unmap_len macros x86, mm: fix misuse of debug_kmap_atomic x86: remove duplicated code with pcpu_need_numa() x86,percpu: fix inverted NUMA test in setup_pcpu_remap() x86: signal: check sas_ss_size instead of sas_ss_flags()
2009-04-03Merge branch 'ipi-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: s390: remove arch specific smp_send_stop() panic: clean up kernel/panic.c panic, smp: provide smp_send_stop() wrapper on UP too panic: decrease oops_in_progress only after having done the panic generic-ipi: eliminate WARN_ON()s during oops/panic generic-ipi: cleanups generic-ipi: remove CSD_FLAG_WAIT generic-ipi: remove kmalloc() generic IPI: simplify barriers and locking
2009-04-03x86, ACPI: add support for x2apic ACPI extensionsSuresh Siddha
All logical processors with APIC ID values of 255 and greater will have their APIC reported through Processor X2APIC structure (type-9 entry type) and all logical processors with APIC ID less than 255 will have their APIC reported through legacy Processor Local APIC (type-0 entry type) only. This is the same case even for NMI structure reporting. The Processor X2APIC Affinity structure provides the association between the X2APIC ID of a logical processor and the proximity domain to which the logical processor belongs. For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device() objects in the ACPI namespace. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-04x86, mtrr: remove debug messageIngo Molnar
The MTRR code grew a new debug message which triggers commonly: [ 40.142276] get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back [ 40.142280] get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back [ 40.142284] get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back [ 40.142311] get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back [ 40.142314] get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back [ 40.142317] get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back Remove this annoyance. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits) trivial: Update my email address trivial: NULL noise: drivers/mtd/tests/mtd_*test.c trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h trivial: Fix misspelling of "Celsius". trivial: remove unused variable 'path' in alloc_file() trivial: fix a pdlfush -> pdflush typo in comment trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL trivial: wusb: Storage class should be before const qualifier trivial: drivers/char/bsr.c: Storage class should be before const qualifier trivial: h8300: Storage class should be before const qualifier trivial: fix where cgroup documentation is not correctly referred to trivial: Give the right path in Documentation example trivial: MTD: remove EOL from MODULE_DESCRIPTION trivial: Fix typo in bio_split()'s documentation trivial: PWM: fix of #endif comment trivial: fix typos/grammar errors in Kconfig texts trivial: Fix misspelling of firmware trivial: cgroups: documentation typo and spelling corrections trivial: Update contact info for Jochen Hein trivial: fix typo "resgister" -> "register" ...
2009-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6: (21 commits) xtensa: we don't need to include asm/io.h xtensa: only build platform or variant if they contain a Makefile xtensa: make startup code discardable xtensa: ccount clocksource xtensa: remove platform rtc hooks xtensa: use generic sched_clock() xtensa: platform: s6105 xtensa: let platform override KERNELOFFSET xtensa: s6000 variant xtensa: s6000 variant core definitions xtensa: variant irq set callbacks xtensa: variant-specific code xtensa: nommu support xtensa: add flat support xtensa: enforce slab alignment to maximum register width xtensa: cope with ram beginning at higher addresses xtensa: don't make bootmem bitmap larger than required xtensa: fix init_bootmem_node() argument order xtensa: use correct stack pointer for stack traces xtensa: beat Kconfig into shape ...
2009-04-03x86, PAT: Remove duplicate memtype reserve in pci mmapSuresh Siddha
pci mmap code was doing memtype reserve for a while now. Recently we added memtype tracking in remap_pfn_range, and pci code indirectly calls remap_pfn_range. So, we don't need seperate tracking in pci code anymore. Which means a patch that removes ~50 lines of code :-). Also, recently we found out that the pci tracking is not working as we expect it to work in some cases. Specifically, userlevel X mmap of pci, with some recent version of X, is having a problem with vm_page_prot getting reset. The pci tracking uses vm_page_prot to pass on the protection type from parent to child during fork. a) Parent does a pci mmap b) We look at PAT and get either UC_MINUS or WC mapping for parent c) Store that mapping type in vma vm_page_prot for future use d) This thread does a fork e) Fork results in mmap_ops ->open for the child process f) We get the vm_page_prot from vma and reserve that type for the child process But, between c) and e) above, the vma vm_page_prot is getting reset to zero. This results in PAT reserve failing at the time of fork as in here. http://marc.info/?l=linux-kernel&m=123858163103240&w=2 This cleanup makes the above problem go away as we do not depend on vm_page_prot in our PAT code anymore. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03x86: disable stack-protector for __restore_processor_state()Joseph Cihula
The __restore_processor_state() fn restores %gs on resume from S3. As such, it cannot be protected by the stack-protector guard since %gs will not be correct on function entry. There are only a few other fns in this file and it should not negatively impact kernel security that they will also have the stack-protector guard removed (and so it's not worth moving them to another file). Without this change, S3 resume on a kernel built with CONFIG_CC_STACKPROTECTOR_ALL=y will fail. Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Tested-by: Chris Wright <chrisw@sous-sol.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> LKML-Reference: <49D13385.5060900@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03Merge git://git.infradead.org/iommu-2.6Linus Torvalds
* git://git.infradead.org/iommu-2.6: intel-iommu: Fix address wrap on 32-bit kernel. intel-iommu: Enable DMAR on 32-bit kernel. intel-iommu: fix PCI device detach from virtual machine intel-iommu: VT-d page table to support snooping control bit iommu: Add domain_has_cap iommu_ops intel-iommu: Snooping control support Fixed trivial conflicts in arch/x86/Kconfig and drivers/pci/intel-iommu.c
2009-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6: (23 commits) parisc: move dereference_function_descriptor to process.c parisc: Move kernel Elf_Fdesc define to <asm/elf.h> parisc: fix build when ARCH_HAS_KMAP parisc: fix "make tar-pkg" parisc: drivers: fix warnings parisc: select BUG always parisc: asm/pdc.h should include asm/page.h parisc: led: remove proc_dir_entry::owner parisc: fix macro expansion in atomic.h parisc: iosapic: fix build breakage parisc: oops_enter()/oops_exit() in die() parisc: document light weight syscall ABI parisc: blink all or loadavg LEDs on oops parisc: add ftrace (function and graph tracer) functionality parisc: simplify sys_clone() parisc: add LATENCYTOP_SUPPORT and CONFIG_STACKTRACE_SUPPORT parisc: allow to build with 16k default kernel page size parisc: expose 32/64-bit capabilities in cpuinfo parisc: use constants instead of numbers in assembly parisc: fix usage of 32bit PTE page table entries on 32bit kernels ...
2009-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-pariscLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-parisc: powerpc/ps3: Add rtc-ps3 powerpc: Hook up rtc-generic, and kill rtc-ppc m68k: Hook up rtc-generic parisc: rtc: Rename rtc-parisc to rtc-generic parisc: rtc: Add missing module alias parisc: rtc: platform_driver_probe() fixups parisc: rtc: get_rtc_time() returns unsigned int
2009-04-03mm: fix misuse of debug_kmap_atomicAkinobu Mita
Commit 7ca43e7564679604d86e9ed834e7bbcffd8a4a3f ("mm: use debug_kmap_atomic") introduced some debug_kmap_atomic() in wrong places. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>