aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)Author
2009-04-07perf_counter: theres more to overflow than writing eventsPeter Zijlstra
Prepare for more generic overflow handling. The new perf_counter_overflow() method will handle the generic bits of the counter overflow, and can return a !0 return value, in which case the counter should be (soft) disabled, so that it won't count until it's properly disabled. XXX: do powerpc and swcounter Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090406094517.812109629@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: make it possible for hw_perf_counter_init to return error codesPaul Mackerras
Impact: better error reporting At present, if hw_perf_counter_init encounters an error, all it can do is return NULL, which causes sys_perf_counter_open to return an EINVAL error to userspace. This isn't very informative for userspace; it means that userspace can't tell the difference between "sorry, oprofile is already using the PMU" and "we don't support this CPU" and "this CPU doesn't support the requested generic hardware event". This commit uses the PTR_ERR/ERR_PTR/IS_ERR set of macros to let hw_perf_counter_init return an error code on error rather than just NULL if it wishes. If it does so, that error code will be returned from sys_perf_counter_open to userspace. If it returns NULL, an EINVAL error will be returned to userspace, as before. This also adapts the powerpc hw_perf_counter_init to make use of this to return ENXIO, EINVAL, EBUSY, or EOPNOTSUPP as appropriate. It would be good to add extra error numbers in future to allow userspace to distinguish the various errors that are currently reported as EINVAL, i.e. irq_period < 0, too many events in a group, conflict between exclude_* settings in a group, and PMU resource conflict in a group. [ v2: fix a bug pointed out by Corey Ashford where error returns from hw_perf_counter_init were not handled correctly in the case of raw hardware events.] Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Orig-LKML-Reference: <20090330171023.682428180@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: powerpc: only reserve PMU hardware when we need itPaul Mackerras
Impact: cooperate with oprofile At present, on PowerPC, if you have perf_counters compiled in, oprofile doesn't work. There is code to allow the PMU to be shared between competing subsystems, such as perf_counters and oprofile, but currently the perf_counter subsystem reserves the PMU for itself at boot time, and never releases it. This makes perf_counter play nicely with oprofile. Now we keep a count of how many perf_counter instances are counting hardware events, and reserve the PMU when that count becomes non-zero, and release the PMU when that count becomes zero. This means that it is possible to have perf_counters compiled in and still use oprofile, as long as there are no hardware perf_counters active. This also means that if oprofile is active, sys_perf_counter_open will fail if the hw_event specifies a hardware event. To avoid races with other tasks creating and destroying perf_counters, we use a mutex. We use atomic_inc_not_zero and atomic_add_unless to avoid having to take the mutex unless there is a possibility of the count going between 0 and 1. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Orig-LKML-Reference: <20090330171023.627912475@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: unify and fix delayed counter wakeupPeter Zijlstra
While going over the wakeup code I noticed delayed wakeups only work for hardware counters but basically all software counters rely on them. This patch unifies and generalizes the delayed wakeup to fix this issue. Since we're dealing with NMI context bits here, use a cmpxchg() based single link list implementation to track counters that have pending wakeups. [ This should really be generic code for delayed wakeups, but since we cannot use cmpxchg()/xchg() in generic code, I've let it live in the perf_counter code. -- Eric Dumazet could use it to aggregate the network wakeups. ] Furthermore, the x86 method of using TIF flags was flawed in that its quite possible to end up setting the bit on the idle task, loosing the wakeup. The powerpc method uses per-cpu storage and does appear to be sufficient. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090330171023.153932974@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: record time running and time enabled for each counterPaul Mackerras
Impact: new functionality Currently, if there are more counters enabled than can fit on the CPU, the kernel will multiplex the counters on to the hardware using round-robin scheduling. That isn't too bad for sampling counters, but for counting counters it means that the value read from a counter represents some unknown fraction of the true count of events that occurred while the counter was enabled. This remedies the situation by keeping track of how long each counter is enabled for, and how long it is actually on the cpu and counting events. These times are recorded in nanoseconds using the task clock for per-task counters and the cpu clock for per-cpu counters. These values can be supplied to userspace on a read from the counter. Userspace requests that they be supplied after the counter value by setting the PERF_FORMAT_TOTAL_TIME_ENABLED and/or PERF_FORMAT_TOTAL_TIME_RUNNING bits in the hw_event.read_format field when creating the counter. (There is no way to change the read format after the counter is created, though it would be possible to add some way to do that.) Using this information it is possible for userspace to scale the count it reads from the counter to get an estimate of the true count: true_count_estimate = count * total_time_enabled / total_time_running This also lets userspace detect the situation where the counter never got to go on the cpu: total_time_running == 0. This functionality has been requested by the PAPI developers, and will be generally needed for interpreting the count values from counting counters correctly. In the implementation, this keeps 5 time values (in nanoseconds) for each counter: total_time_enabled and total_time_running are used when the counter is in state OFF or ERROR and for reporting back to userspace. When the counter is in state INACTIVE or ACTIVE, it is the tstamp_enabled, tstamp_running and tstamp_stopped values that are relevant, and total_time_enabled and total_time_running are determined from them. (tstamp_stopped is only used in INACTIVE state.) The reason for doing it like this is that it means that only counters being enabled or disabled at sched-in and sched-out time need to be updated. There are no new loops that iterate over all counters to update total_time_enabled or total_time_running. This also keeps separate child_total_time_running and child_total_time_enabled fields that get added in when reporting the totals to userspace. They are separate fields so that they can be atomic. We don't want to use atomics for total_time_running, total_time_enabled etc., because then we would have to use atomic sequences to update them, which are slower than regular arithmetic and memory accesses. It is possible to measure total_time_running by adding a task_clock counter to each group of counters, and total_time_enabled can be measured approximately with a top-level task_clock counter (though inaccuracies will creep in if you need to disable and enable groups since it is not possible in general to disable/enable the top-level task_clock counter simultaneously with another group). However, that adds extra overhead - I measured around 15% increase in the context switch latency reported by lat_ctx (from lmbench) when a task_clock counter was added to each of 2 groups, and around 25% increase when a task_clock counter was added to each of 4 groups. (In both cases a top-level task-clock counter was also added.) In contrast, the code added in this commit gives better information with no overhead that I could measure (in fact in some cases I measured lower times with this code, but the differences were all less than one standard deviation). [ v2: address review comments by Andrew Morton. ] Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Orig-LKML-Reference: <18890.6578.728637.139402@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: new output ABI - part 1Peter Zijlstra
Impact: Rework the perfcounter output ABI use sys_read() only for instant data and provide mmap() output for all async overflow data. The first mmap() determines the size of the output buffer. The mmap() size must be a PAGE_SIZE multiple of 1+pages, where pages must be a power of 2 or 0. Further mmap()s of the same fd must have the same size. Once all maps are gone, you can again mmap() with a new size. In case of 0 extra pages there is no data output and the first page only contains meta data. When there are data pages, a poll() event will be generated for each full page of data. Furthermore, the output is circular. This means that although 1 page is a valid configuration, its useless, since we'll start overwriting it the instant we report a full page. Future work will focus on the output format (currently maintained) where we'll likey want each entry denoted by a header which includes a type and length. Further future work will allow to splice() the fd, also containing the async overflow data -- splice() would be mutually exclusive with mmap() of the data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090323172417.470536358@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: add an mmap method to allow userspace to read hardware countersPaul Mackerras
Impact: new feature giving performance improvement This adds the ability for userspace to do an mmap on a hardware counter fd and get access to a read-only page that contains the information needed to translate a hardware counter value to the full 64-bit counter value that would be returned by a read on the fd. This is useful on architectures that allow user programs to read the hardware counters, such as PowerPC. The mmap will only succeed if the counter is a hardware counter monitoring the current process. On my quad 2.5GHz PowerPC 970MP machine, userspace can read a counter and translate it to the full 64-bit value in about 30ns using the mmapped page, compared to about 830ns for the read syscall on the counter, so this does give a significant performance improvement. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Orig-LKML-Reference: <20090323172417.297057964@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: remove the event config bitfieldsPeter Zijlstra
Since the bitfields turned into a bit of a mess, remove them and rely on good old masks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090323172417.059499915@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: fix type/event_id layout on big-endian systemsPaul Mackerras
Impact: build fix for powerpc Commit db3a944aca35ae61 ("perf_counter: revamp syscall input ABI") expanded the hw_event.type field into a union of structs containing bitfields. In particular it introduced a type field and a raw_type field, with the intention that the 1-bit raw_type field should overlay the most-significant bit of the 8-bit type field, and in fact perf_counter_alloc() now assumes that (or at least, assumes that raw_type doesn't overlay any of the bits that are 1 in the values of PERF_TYPE_{HARDWARE,SOFTWARE,TRACEPOINT}). Unfortunately this is not true on big-endian systems such as PowerPC, where bitfields are laid out from left to right, i.e. from most significant bit to least significant. This means that setting hw_event.type = PERF_TYPE_SOFTWARE will set hw_event.raw_type to 1. This fixes it by making the layout depend on whether or not __BIG_ENDIAN_BITFIELD is defined. It's a bit ugly, but that's what we get for using bitfields in a user/kernel ABI. Also, that commit didn't fix up some places in arch/powerpc/kernel/ perf_counter.c where hw_event.raw and hw_event.event_id were used. This fixes them too. Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-06perf_counter: powerpc: clean up perc_counter_interruptPaul Mackerras
Impact: cleanup This updates the powerpc perf_counter_interrupt following on from the "perf_counter: unify irq output code" patch. Since we now use the generic perf_counter_output code, which sets the perf_counter_pending flag directly, we no longer need the need_wakeup variable. This removes need_wakeup and makes perf_counter_interrupt use get_perf_counter_pending() instead. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Orig-LKML-Reference: <20090319194234.024464535@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: unify irq output codePeter Zijlstra
Impact: cleanup Having 3 slightly different copies of the same code around does nobody any good. First step in revamping the output format. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Orig-LKML-Reference: <20090319194233.929962222@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: revamp syscall input ABIPeter Zijlstra
Impact: modify ABI The hardware/software classification in hw_event->type became a little strained due to the addition of tracepoint tracing. Instead split up the field and provide a type field to explicitly specify the counter type, while using the event_id field to specify which event to use. Raw counters still work as before, only the raw config now goes into raw_event. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Orig-LKML-Reference: <20090319194233.836807573@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06perf_counter: abstract wakeup flag setting in core to fix powerpc buildPaul Mackerras
Impact: build fix for powerpc Commit bd753921015e7905 ("perf_counter: software counter event infrastructure") introduced a use of TIF_PERF_COUNTERS into the core perfcounter code. This breaks the build on powerpc because we use a flag in a per-cpu area to signal wakeups on powerpc rather than a thread_info flag, because the thread_info flags have to be manipulated with atomic operations and are thus slower than per-cpu flags. This fixes the by changing the core to use an abstracted set_perf_counter_pending() function, which is defined on x86 to set the TIF_PERF_COUNTERS flag and on powerpc to set the per-cpu flag (paca->perf_counter_pending). It changes the previous powerpc definition of set_perf_counter_pending to not take an argument and adds a clear_perf_counter_pending, so as to simplify the definition on x86. On x86, set_perf_counter_pending() is defined as a macro. Defining it as a static inline in arch/x86/include/asm/perf_counters.h causes compile failures because <asm/perf_counters.h> gets included early in <linux/sched.h>, and the definitions of set_tsk_thread_flag etc. are therefore not available in <asm/perf_counters.h>. (On powerpc this problem is avoided by defining set_perf_counter_pending etc. in <asm/hw_irq.h>.) Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-06Merge branch 'linus' into perfcounters/core-v2Ingo Molnar
Merge reason: we have gathered quite a few conflicts, need to merge upstream Conflicts: arch/powerpc/kernel/Makefile arch/x86/ia32/ia32entry.S arch/x86/include/asm/hardirq.h arch/x86/include/asm/unistd_32.h arch/x86/include/asm/unistd_64.h arch/x86/kernel/cpu/common.c arch/x86/kernel/irq.c arch/x86/kernel/syscall_table_32.S arch/x86/mm/iomap_32.c include/linux/sched.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-05Merge branch 'tracing-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits) tracing, net: fix net tree and tracing tree merge interaction tracing, powerpc: fix powerpc tree and tracing tree interaction ring-buffer: do not remove reader page from list on ring buffer free function-graph: allow unregistering twice trace: make argument 'mem' of trace_seq_putmem() const tracing: add missing 'extern' keywords to trace_output.h tracing: provide trace_seq_reserve() blktrace: print out BLK_TN_MESSAGE properly blktrace: extract duplidate code blktrace: fix memory leak when freeing struct blk_io_trace blktrace: fix blk_probes_ref chaos blktrace: make classic output more classic blktrace: fix off-by-one bug blktrace: fix the original blktrace blktrace: fix a race when creating blk_tree_root in debugfs blktrace: fix timestamp in binary output tracing, Text Edit Lock: cleanup tracing: filter fix for TRACE_EVENT_FORMAT events ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release() x86: kretprobe-booster interrupt emulation code fix ... Fix up trivial conflicts in arch/parisc/include/asm/ftrace.h include/linux/memory.h kernel/extable.c kernel/module.c
2009-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-pariscLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-parisc: powerpc/ps3: Add rtc-ps3 powerpc: Hook up rtc-generic, and kill rtc-ppc m68k: Hook up rtc-generic parisc: rtc: Rename rtc-parisc to rtc-generic parisc: rtc: Add missing module alias parisc: rtc: platform_driver_probe() fixups parisc: rtc: get_rtc_time() returns unsigned int
2009-04-02Simplify copy_thread()Alexey Dobriyan
First argument unused since 2.3.11. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02workqueue: add to_delayed_work() helper functionJean Delvare
It is a fairly common operation to have a pointer to a work and to need a pointer to the delayed work it is contained in. In particular, all delayed works which want to rearm themselves will have to do that. So it would seem fair to offer a helper function for this operation. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Greg KH <greg@kroah.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02powerpc: Hook up rtc-generic, and kill rtc-ppcGeert Uytterhoeven
PowerPC has been a long time user of the generic RTC abstraction, so hook up rtc-generic: - Create the "rtc-generic" platform device if ppc_md.get_rtc_time is set, - Kill rtc-ppc, as rtc-generic offers the same functionality in a more generic way, and supports autoloading through udev. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2009-04-02tracing, powerpc: fix powerpc tree and tracing tree interactionStephen Rothwell
Today's linux-next build (powerpc allyesconfig) failed like this: arch/powerpc/kernel/ftrace.c: In function 'prepare_ftrace_return': arch/powerpc/kernel/ftrace.c:612: warning: passing argument 3 of 'ftrace_push_return_trace' makes pointer from integer without a cast arch/powerpc/kernel/ftrace.c:612: error: too many arguments to function 'ftrace_push_return_trace' Caused by commit 5d1a03dc541dc6672e60e57249ed22f40654ca47 ("function-graph: moved the timestamp from arch to generic code") from the tracing tree which (removed an argument from ftrace_push_return_trace()) interacting with commit 6794c78243bfda020ab184d6d578944f8e90d26c ("powerpc64: port of the function graph tracer") from the powerpc tree. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: <linuxppc-dev@ozlabs.org> LKML-Reference: <20090327230834.93d0221d.sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-01Merge branch 'linux-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (88 commits) PCI: fix HT MSI mapping fix PCI: don't enable too much HT MSI mapping x86/PCI: make pci=lastbus=255 work when acpi is on PCI: save and restore PCIe 2.0 registers PCI: update fakephp for bus_id removal PCI: fix kernel oops on bridge removal PCI: fix conflict between SR-IOV and config space sizing powerpc/PCI: include pci.h in powerpc MSI implementation PCI Hotplug: schedule fakephp for feature removal PCI Hotplug: rename legacy_fakephp to fakephp PCI Hotplug: restore fakephp interface with complete reimplementation PCI: Introduce /sys/bus/pci/devices/.../rescan PCI: Introduce /sys/bus/pci/devices/.../remove PCI: Introduce /sys/bus/pci/rescan PCI: Introduce pci_rescan_bus() PCI: do not enable bridges more than once PCI: do not initialize bridges more than once PCI: always scan child buses PCI: pci_scan_slot() returns newly found devices PCI: don't scan existing devices ... Fix trivial append-only conflict in Documentation/feature-removal-schedule.txt
2009-03-31proc 2/2: remove struct proc_dir_entry::ownerAlexey Dobriyan
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy as correctly noted at bug #12454. Someone can lookup entry with NULL ->owner, thus not pinning enything, and release it later resulting in module refcount underflow. We can keep ->owner and supply it at registration time like ->proc_fops and ->data. But this leaves ->owner as easy-manipulative field (just one C assignment) and somebody will forget to unpin previous/pin current module when switching ->owner. ->proc_fops is declared as "const" which should give some thoughts. ->read_proc/->write_proc were just fixed to not require ->owner for protection. rmmod'ed directories will be empty and return "." and ".." -- no harm. And directories with tricky enough readdir and lookup shouldn't be modular. We definitely don't want such modular code. Removing ->owner will also make PDE smaller. So, let's nuke it. Kudos to Jeff Layton for reminding about this, let's say, oversight. http://bugzilla.kernel.org/show_bug.cgi?id=12454 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-30Merge commit 'origin/master' into nextBenjamin Herrenschmidt
Manual merge of: arch/powerpc/include/asm/elf.h drivers/i2c/busses/i2c-mpc.c
2009-03-27Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2Ingo Molnar
Conflicts: arch/parisc/kernel/irq.c arch/x86/include/asm/fixmap_64.h arch/x86/include/asm/setup.h kernel/irq/handle.c Semantic merge: arch/x86/include/asm/fixmap.h Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-27powerpc: Fix bugs introduced by sysfs changesBenjamin Herrenschmidt
Rusty's patch to change our sysfs access to various registers to use smp_call_function_single() introduced a whole bunch of warnings. This fixes them. This version also fixes an actual bug in here where it did mtspr instead of mfspr when reading the files Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-27powerpc: Sanitize stack pointer in signal handling codeJosh Boyer
On powerpc64 machines running 32-bit userspace, we can get garbage bits in the stack pointer passed into the kernel. Most places handle this correctly, but the signal handling code uses the passed value directly for allocating signal stack frames. This fixes the issue by introducing a get_clean_sp function that returns a sanitized stack pointer. For 32-bit tasks on a 64-bit kernel, the stack pointer is masked correctly. In all other cases, the stack pointer is simply returned. Additionally, we pass an 'is_32' parameter to get_sigframe now in order to get the properly sanitized stack. The callers are know to be 32 or 64-bit statically. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-26Merge branch 'irq-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits) x86: disable __do_IRQ support sparseirq, powerpc/cell: fix unused variable warning in interrupt.c genirq: deprecate obsolete typedefs and defines genirq: deprecate __do_IRQ genirq: add doc to struct irqaction genirq: use kzalloc instead of explicit zero initialization genirq: make irqreturn_t an enum genirq: remove redundant if condition genirq: remove unused hw_irq_controller typedef irq: export remove_irq() and setup_irq() symbols irq: match remove_irq() args with setup_irq() irq: add remove_irq() for freeing of setup_irq() irqs genirq: assert that irq handlers are indeed running in hardirq context irq: name 'p' variables a bit better irq: further clean up the free_irq() code flow irq: refactor and clean up the free_irq() code flow irq: clean up manage.c irq: use GFP_KERNEL for action allocation in request_irq() kernel/irq: fix sparse warning: make symbol static irq: optimize init_kstat_irqs/init_copy_kstat_irqs ...
2009-03-25powerpc/PCI: include pci.h in powerpc MSI implementationJesse Barnes
This file uses PCI MSI defines and so needs pci.h. Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-24KVM: ppc: No need to include core-header for KVM in asm-offsets.c currentlyHollis Blanchard
Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24powerpc/mm: Introduce early_init_mmu() on 64-bitBenjamin Herrenschmidt
This moves some MMU related init code out of setup_64.c into hash_utils_64.c and calls it early_init_mmu() and early_init_mmu_secondary(). This will make it easier to plug in a new MMU type. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24powerpc/mm: e300c2/c3/c4 TLB errata workaroundKumar Gala
Complete workaround for DTLB errata in e300c2/c3/c4 processors. Due to the bug, the hardware-implemented LRU algorythm always goes to way 1 of the TLB. This fix implements the proposed software workaround in form of a LRW table for chosing the TLB-way. Based on patch from David Jander <david@protonic.nl> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24powerpc/mm: Used free register to save a few cycles in SW TLB miss handlingKumar Gala
Now that r0 is free we can keep the value of I/DMISS in r3 and not reload it before doing the tlbli/d. This saves us a few cycles in the fast path case. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24powerpc/mm: Remove unused register usage in SW TLB miss handlingKumar Gala
Long ago we had some code that actually used the CTR in the SW TLB miss handlers (603/e300). Since we don't use it no reason to waste cycles saving it off and restoring it (we actually didn't restore it in the fast path case). Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24powerpc: setup default archdata for {of_}platform via bus_register_notifierKumar Gala
Since a number of powerpc chips are SoCs we end up having dma-able devices that are registered as platform or of_platform devices. We need to hook the archdata to setup proper dma_ops for these devices. Rather than having to add a bus_notify to each platform we add a default one at the highest priority (called first) to set the default dma_ops for of_platform and platform devices to dma_direct_ops. This allows platform code to override the ops by providing their own notifier call back. In the future to enable >4G DMA support on ppc32 we can hook swiotlb ops. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24powerpc/pci: Default to dma_direct_ops for pci dma_opsKumar Gala
This will allow us to remove the ppc32 specific checks in get_dma_ops() that defaults to dma_direct_ops if the archdata is NULL. We really should always have archdata set to something going forward. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24powerpc: Make sysfs code use smp_call_function_singleRusty Russell
Impact: performance improvement This fixes 'powerpc: avoid cpumask games in arch/powerpc/kernel/sysfs.c' which talked about using smp_call_function_single, but actually used work_on_cpu (an older version of the patch). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24powerpc: Fix prom_init on 32-bit OF machinesBenjamin Herrenschmidt
Commit e7943fbbfdb6eef03c003b374de1f802cc14f02a broke ppc32 using Open Firmware client interface due to using the wrong relocation macro when accessing the variable "linux_banner". Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24Merge commit 'origin/master' into nextBenjamin Herrenschmidt
2009-03-23powerpc/mm: Fix Respect _PAGE_COHERENT on classic ppc32 SW TLB load machinesKumar Gala
Grant picked up the wrong version of "Respect _PAGE_COHERENT on classic ppc32 SW" (commit a4bd6a93c3f14691c8a29e53eb04dc734b27f0db) It was missing the code to actually deal with the fixup of _PAGE_COHERENT based on the CPU feature. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-03-23Merge branches 'irq/sparseirq' and 'linus' into irq/coreIngo Molnar
2009-03-20PCI MSI: Add support for multiple MSIMatthew Wilcox
Add the new API pci_enable_msi_block() to allow drivers to request multiple MSI and reimplement pci_enable_msi in terms of pci_enable_msi_block. Ensure that the architecture back ends don't have to know about multiple MSI. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-17powerpc/mm: Respect _PAGE_COHERENT on classic ppc32 SWKumar Gala
Since we now set _PAGE_COHERENT in the Linux PTE we shouldn't be clearing it out before we setup the SW TLB. Today all the SW TLB machines (603/e300) that we support are non-SMP, however there are some errata on some devices that cause us to set _PAGE_COHERENT via CPU_FTR_NEED_COHERENT. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-03-16Merge branches 'irq/genirq' and 'linus' into irq/coreIngo Molnar
2009-03-11powerpc/kconfig: Kill PPC_MULTIPLATFORMBenjamin Herrenschmidt
CONFIG_PPC_MULTIPLATFORM is a remain of the pre-powerpc days and isn't really meaningful anymore. It was basically equivalent to PPC64 || 6xx. This removes it along with the following changes: - 32-bit platforms that relied on PPC32 && PPC_MULTIPLATFORM now rely on 6xx which is what they want anyway. - A new symbol, PPC_BOOK3S, is defined that represent compliance with the "Server" variant of the architecture. This is set when either 6xx or PPC64 is set and open the door for future BOOK3E 64-bit. - 64-bit platforms that relied on PPC64 && PPC_MULTIPLATFORM now use PPC64 && PPC_BOOK3S - A separate and selectable CONFIG_PPC_OF_BOOT_TRAMPOLINE option is now used to control the use of prom_init.c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11powerpc/irq: Convert obsolete irq_desc_t to struct irq_descThomas Gleixner
Impact: cleanup Convert the last remaining users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: linuxppc-dev@ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11powerpc/udbg: Fix lost byte during console handover; change LFCR to CRLFAndrew Klossner
When the console is on a serial port to be driven by serial8250, a character can be lost from the end of the first line in the two-line sequence serial8250.0: ttyS0 at MMIO 0xe0004500 (irq = 42) is a 16550A console handover: boot [udbg0] -> real [ttyS0] This happens because udbg_puts or udbg_write stuff the last byte of the line into the Tx FIFO and return, whereupon the serial8250 initialization code immediately empties that FIFO. The fix: udbg_puts and udbg_write now wait for the Tx FIFO to clear before returning. This delays the system by one additional serial frame time for each line written by udbg, but the effect is not noticeable, a cumulative 17 milliseconds for 200 lines of early printk output at 115200 baud. Also, the routines in udbg_16550.c now emit CRLF instead of LFCR. Linux makes a point of emitting CRLF because, when serial output is captured to a file, LFCR sequences can confuse text editors. See http://lkml.org/lkml/2006/2/4/50 for some history. Signed-off-by: Andrew Klossner <andrew@cesa.opbu.xerox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11powerpc/pci: Fix typo: s/resouces/resources/ in a pr_debugWolfram Sang
Fix typo: s/resouces/resources/ in a pr_debug Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11powerpc: Print linux_banner in prom_initMichael Ellerman
So at least you can see what kernel you're booting if you die before the kernel prints it mid-way through start_kernel(). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11powerpc/oprofile: Enable support for ppc750 processorsOctavian Purdila
This patch enables oprofile for all 3 FX variants and GX variant of the 750 processor. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11powerpc: Remove unused asm-offsets entries for cpu_specMichael Ellerman
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>