aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2009-05-15perf_counter: allow arch to supply event misc flags and instruction pointerPaul Mackerras
At present the values we put in overflow events for the misc flags indicating processor mode and the instruction pointer are obtained using the standard user_mode() and instruction_pointer() functions. Those functions tell you where the performance monitor interrupt was taken, which might not be exactly where the counter overflow occurred, for example because interrupts were disabled at the point where the overflow occurred, or because the processor had many instructions in flight and chose to complete some more instructions beyond the one that caused the counter overflow. Some architectures (e.g. powerpc) can supply more precise information about where the counter overflow occurred and the processor mode at that point. This introduces new functions, perf_misc_flags() and perf_instruction_pointer(), which arch code can override to provide more precise information if available. They have default implementations which are identical to the existing code. This also adds a new misc flag value, PERF_EVENT_MISC_HYPERVISOR, for the case where a counter overflow occurred in the hypervisor. We encode the processor mode in the 2 bits previously used to indicate user or kernel mode; the values for user and kernel mode are unchanged and hypervisor mode is indicated by both bits being set. [ Impact: generalize perfcounter core facilities ] Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <18956.1272.818511.561835@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15perf_counter: frequency based adaptive irq_periodPeter Zijlstra
Instead of specifying the irq_period for a counter, provide a target interrupt frequency and dynamically adapt the irq_period to match this frequency. [ Impact: new perf-counter attribute/feature ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090515132018.646195868@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15perf_counter: per user mlock giftPeter Zijlstra
Instead of a per-process mlock gift for perf-counters, use a per-user gift so that there is less of a DoS potential. [ Impact: allow less worst-case unprivileged memory consumption ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090515132018.496182835@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15perf_counter: Rework the perf counter disable/enablePeter Zijlstra
The current disable/enable mechanism is: token = hw_perf_save_disable(); ... /* do bits */ ... hw_perf_restore(token); This works well, provided that the use nests properly. Except we don't. x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore provide a reference counter disable/enable interface, where the first disable disables the hardware, and the last enable enables the hardware again. [ Impact: refactor, simplify the PMU disable/enable logic ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-08perf_counter: add PERF_RECORD_CPUPeter Zijlstra
Allow recording the CPU number the event was generated on. RFC: this leaves a u32 as reserved, should we fill in the node_id() there, or leave this open for future extention, as userspace can already easily do the cpu->node mapping if needed. [ Impact: extend perfcounter output record format ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090508170029.008627711@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-08perf_counter: add PERF_RECORD_CONFIGPeter Zijlstra
Much like CONFIG_RECORD_GROUP records the hw_event.config to identify the values, allow to record this for all counters. [ Impact: extend perfcounter output record format ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090508170028.923228280@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-08perf_counter: rework ioctl()sPeter Zijlstra
Corey noticed that ioctl()s on grouped counters didn't work on the whole group. This extends the ioctl() interface to take a second argument that is interpreted as a flags field. We then provide PERF_IOC_FLAG_GROUP to toggle the behaviour. Having this flag gives the greatest flexibility, allowing you to individually enable/disable/reset counters in a group, or all together. [ Impact: fix group counter enable/disable semantics ] Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090508170028.837558214@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06Merge branch 'core/locking' into perfcounters/coreIngo Molnar
Merge reason: we moved a mutex.h commit that originated from the perfcounters tree into core/locking - but now merge back that branch to solve a merge artifact and to pick up cleanups of this commit that happened in core/locking. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-05perf_counter: provide an mlock thresholdPeter Zijlstra
Provide a threshold to relax the mlock accounting, increasing usability. Each counter gets perf_counter_mlock_kb for free. [ Impact: allow more mmap buffering ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090505155437.112113632@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-05perf_counter: add ioctl(PERF_COUNTER_IOC_RESET)Peter Zijlstra
Provide a way to reset an existing counter - this eases PAPI libraries around perfcounters. Similar to read() it doesn't collapse pending child counters. [ Impact: new perfcounter fd ioctl method to reset counters ] Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090505155437.022272933@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-05perf_counter: uncouple data_head updates from wakeupsPeter Zijlstra
Keep data_head up-to-date irrespective of notifications. This fixes the case where you disable a counter and don't get a notification for the last few pending events, and it also allows polling usage. [ Impact: increase precision of perfcounter mmap-ed fields ] Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090505155436.925084300@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-04perf_counter: initialize the per-cpu context earlierIngo Molnar
percpu scheduling for perfcounters wants to take the context lock, but that lock first needs to be initialized. Currently it is an early_initcall() - but that is too late, the task tick runs much sooner than that. Call it explicitly from the scheduler init sequence instead. [ Impact: fix access-before-init crash ] LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-01perf_counter: fix race in perf_output_*Peter Zijlstra
When two (or more) contexts output to the same buffer, it is possible to observe half written output. Suppose we have CPU0 doing perf_counter_mmap(), CPU1 doing perf_counter_overflow(). If CPU1 does a wakeup and exposes head to user-space, then CPU2 can observe the data CPU0 is still writing. [ Impact: fix occasionally corrupted profiling records ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090501102533.007821627@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-30Merge branch 'core/signal' into perfcounters/coreThomas Gleixner
This is necessary to avoid the conflict of syscall numbers. Conflicts: arch/x86/ia32/ia32entry.S arch/x86/include/asm/unistd_32.h arch/x86/include/asm/unistd_64.h Fixes up the borked syscall numbers of perfcounters versus preadv/pwritev as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-04-30signals: implement sys_rt_tgsigqueueinfoThomas Gleixner
sys_kill has the per thread counterpart sys_tgkill. sigqueueinfo is missing a thread directed counterpart. Such an interface is important for migrating applications from other OSes which have the per thread delivery implemented. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: Ulrich Drepper <drepper@redhat.com>
2009-04-30mutex: add atomic_dec_and_mutex_lock(), fixAndrew Morton
include/linux/mutex.h:136: warning: 'mutex_lock' declared inline after being called include/linux/mutex.h:136: warning: previous declaration of 'mutex_lock' was here uninline it. [ Impact: clean up and uninline, address compiler warning ] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <200904292318.n3TNIsi6028340@imap1.linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits) e100: do not go D3 in shutdown unless system is powering off netfilter: revised locking for x_tables Bluetooth: Fix connection establishment with low security requirement Bluetooth: Add different pairing timeout for Legacy Pairing Bluetooth: Ensure that HCI sysfs add/del is preempt safe net: Avoid extra wakeups of threads blocked in wait_for_packet() net: Fix typo in net_device_ops description. ipv4: Limit size of route cache hash table Add reference to CAPI 2.0 standard Documentation/isdn/INTERFACE.CAPI update Documentation/isdn/00-INDEX ixgbe: Fix WoL functionality for 82599 KX4 devices veth: prevent oops caused by netdev destructor xfrm: wrong hash value for temporary SA forcedeth: tx timeout fix net: Fix LL_MAX_HEADER for CONFIG_TR_MODULE mlx4_en: Handle page allocation failure during receive mlx4_en: Fix cleanup flow on cq activation vlan: update vlan carrier state for admin up/down netfilter: xt_recent: fix stack overread in compat code ...
2009-04-29mutex: add atomic_dec_and_mutex_lock()Eric Paris
Much like the atomic_dec_and_lock() function in which we take an hold a spin_lock if we drop the atomic to 0 this function takes and holds the mutex if we dec the atomic to 0. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Orig-LKML-Reference: <20090323172417.410913479@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29perf_counter, x86: consistent use of type int for counter indexRobert Richter
The type of counter index is sometimes implemented as unsigned int. This patch changes this to have a consistent usage of int. [ Impact: cleanup ] Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1241002046-8832-21-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29perfcounters: rename struct hw_perf_counter_ops into struct pmuRobert Richter
This patch renames struct hw_perf_counter_ops into struct pmu. It introduces a structure to describe a cpu specific pmu (performance monitoring unit). It may contain ops and data. The new name of the structure fits better, is shorter, and thus better to handle. Where it was appropriate, names of function and variable have been changed too. [ Impact: cleanup ] Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1241002046-8832-7-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29perf_counter, x86: declare perf_max_counters only for CONFIG_PERF_COUNTERSRobert Richter
This is only needed for CONFIG_PERF_COUNTERS enabled. [ Impact: cleanup ] Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1241002046-8832-3-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29Merge branch 'linus' into perfcounters/coreIngo Molnar
Merge reason: This brach was on -rc1, refresh it to almost-rc4 to pick up the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28netfilter: revised locking for x_tablesStephen Hemminger
The x_tables are organized with a table structure and a per-cpu copies of the counters and rules. On older kernels there was a reader/writer lock per table which was a performance bottleneck. In 2.6.30-rc, this was converted to use RCU and the counters/rules which solved the performance problems for do_table but made replacing rules much slower because of the necessary RCU grace period. This version uses a per-cpu set of spinlocks and counters to allow to table processing to proceed without the cache thrashing of a global reader lock and keeps the same performance for table updates. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-28regulator: fix header file missing kernel-docRandy Dunlap
Add regulator header file missing kernel-doc: Warning(include/linux/regulator/driver.h:117): No description found for parameter 'set_mode' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> cc: Liam Girdwood <lrg@slimlogic.co.uk> cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2009-04-28net: Avoid extra wakeups of threads blocked in wait_for_packet()Eric Dumazet
In 2.6.25 we added UDP mem accounting. This unfortunatly added a penalty when a frame is transmitted, since we have at TX completion time to call sock_wfree() to perform necessary memory accounting. This calls sock_def_write_space() and utimately scheduler if any thread is waiting on the socket. Thread(s) waiting for an incoming frame was scheduled, then had to sleep again as event was meaningless. (All threads waiting on a socket are using same sk_sleep anchor) This adds lot of extra wakeups and increases latencies, as noted by Christoph Lameter, and slows down softirq handler. Reference : http://marc.info/?l=linux-netdev&m=124060437012283&w=2 Fortunatly, Davide Libenzi recently added concept of keyed wakeups into kernel, and particularly for sockets (see commit 37e5540b3c9d838eb20f2ca8ea2eb8072271e403 epoll keyed wakeups: make sockets use keyed wakeups) Davide goal was to optimize epoll, but this new wakeup infrastructure can help non epoll users as well, if they care to setup an appropriate handler. This patch introduces new DEFINE_WAIT_FUNC() helper and uses it in wait_for_packet(), so that only relevant event can wakeup a thread blocked in this function. Trace of function calls from bnx2 TX completion bnx2_poll_work() is : __kfree_skb() skb_release_head_state() sock_wfree() sock_def_write_space() __wake_up_sync_key() __wake_up_common() receiver_wake_function() : Stops here since thread is waiting for an INPUT Reported-by: Christoph Lameter <cl@linux.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-27Remove unused support code for refok sections.Tim Abbott
The old refok sections .text.init.refok .data.init.refok .exit.text.refok have been deprecated since commit 312b1485fb509c9bc32eda28ad29537896658cb8. After the other patches in this patch series nothing is put in these sections, so clean things up by eliminating all the remaining references to them. Signed-off-by: Tim Abbott <tabbott@mit.edu> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: only save/restore existent registers in the PCIe capability x86/PCI: don't bother with root quirks if _CRS is used docbooks: add/fix PCI kernel-doc PCI: cleanup debug output resources x86/PCI: set_pci_bus_resources_arch_default cleanups x86/PCI: Move set_pci_bus_resources_arch_default into arch/x86 x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case PCI quirk: disable MSI on VIA VT3364 chipsets
2009-04-27net: Fix typo in net_device_ops description.Mike Rapoport
Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-27net: Fix LL_MAX_HEADER for CONFIG_TR_MODULEAdrian Bunk
Unless I miss anything this should fix a bug. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-26Add new HEAD_TEXT_SECTION macro.Tim Abbott
This patch is preparation for replacing all uses of ".head.text" or ".text.head" in the kernel with macros, so that the section name can later be changed without having to touch a lot of the kernel. Since some linker scripts do more complex things than referencing HEAD_TEXT, we add a HEAD_TEXT_SECTION macro that just contains the actual name. I've defined HEAD_TEXT_SECTION in a new header, include/linux/section-names.h, so that this section name only needs to appear in one place. I anticipate creating similar macro structures for a number of other section names. The long-term goal here is to be able to change the kernel's magic section names to those that are compatible with -ffunction-sections -fdata-sections. This requires renaming all magic sections with names of the form ".text.foo". Signed-off-by: Tim Abbott <tabbott@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-25Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2009-04-24Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (34 commits) ACPI, i915: Register ACPI video even when not modesetting Revert "ACPICA: delete check for AML access to port 0x81-83" I/O port protection: update for windows compatibility. sony-laptop: always try to unblock rfkill on load sony-laptop: fix bogus error message display on resume ACPI: EC: Fix ACPI EC resume non-query interrupt message sony-laptop: SNC input event 38 fix sony-laptop: SNC 127 Initialization Fix sony-laptop: Duplicate SNC 127 Event Fix ACPI: prevent processor.max_cstate=0 boot crash ACPI/hpet: prevent boot hang when hpet=force used on ICH-4M ACPI: delete obsolete "bus master activity" proc field ACPI: idle: mark_tsc_unstable() at init-time, not run-time ACPI: add /sys/firmware/acpi/interrupts/sci_not counter ACPI video: fix an error when the brightness levels on AC and on Battery are same acpi-cpufreq: Do not let get_measured perf depend on internal variable acpi-cpufreq: style-only: add parens to math expression acpi-cpufreq: Cleanup: Use printk_once x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf thinkpad-acpi: bump up version to 0.23 ...
2009-04-24Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Fix potential inode allocation soft lockup in Orlov allocator ext4: Make the extent validity check more paranoid jbd: use SWRITE_SYNC_PLUG when writing synchronous revoke records jbd2: use SWRITE_SYNC_PLUG when writing synchronous revoke records ext4: really print the find_group_flex fallback warning only once
2009-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: pwc : do not pass stack allocated buffers to USB core. USB: otg: Fix bug on remove path without transceiver USB: correct error handling in cdc-wdm USB: removal of tty->low_latency hack dating back to the old serial code USB: serial: sierra driver bug fix for composite interface USB: gadget: omap_udc uses platform_driver_probe() USB: ci13xxx_udc: fix build error USB: musb: Prevent multiple includes of musb.h USB: pass mem_flags to dma_alloc_coherent USB: g_file_storage: fix use-after-free bug when closing files USB: ehci-sched.c: EHCI SITD scheduling bugfix USB: fix mos7840 problem with minor numbers USB: mos7840: add new device id USB: musb: fix build when !CONFIG_PM USB: musb: Remove my email address from few musb related drivers USB: Gadget: MIPS CI13xxx UDC bugfixes USB: Unusual Device support for Gold MP3 Player Energy USB: serial: fix lifetime and locking problems
2009-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixesLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: Ensure that the inode goal block settings are updated GFS2: Fix bug in block allocation bitops: Add __ffs64 bitop
2009-04-24Merge branch 'kvm-updates/2.6.30' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.30' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Unregister cpufreq notifier on unload KVM: x86: release time_page on vcpu destruction KVM: Fix overlapping check for memory slots KVM: MMU: disable global page optimization KVM: ia64: fix locking order entering guest KVM: MMU: Fix off-by-one calculating large page count
2009-04-24netfilter: nf_ct_dccp: add missing role attributes for DCCPPablo Neira Ayuso
This patch adds missing role attribute to the DCCP type, otherwise the creation of entries is not of any use. The attribute added is CTA_PROTOINFO_DCCP_ROLE which contains the role of the conntrack original tuple. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-24Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: cfq-iosched: cache prio_tree root in cfqq->p_root cfq-iosched: fix bug with aliased request and cooperation detection cfq-iosched: clear ->prio_trees[] on cfqd alloc block: fix intermittent dm timeout based oops umem: fix request_queue lock warning block: simplify I/O stat accounting pktcdvd.h should include mempool.h cfq-iosched: use the default seek distance when there aren't enough seek samples cfq-iosched: make seek_mean converge more quickly block: make blk_abort_queue() ignore non-request based devices block: include empty disks in /proc/diskstats bio: use bio_kmalloc() in copy/map functions bio: fix bio_kmalloc() block: fix queue bounce limit setting block: fix SG_IO vector request data length handling scatterlist: make sure sg_miter_next() doesn't return 0 sized mappings
2009-04-24Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc: Fix modular build of ide-pmac when mediabay is built in powerpc/pasemi: Fix build error on UP powerpc: Make macintosh/mediabay driver depend on CONFIG_BLOCK maintainers: Fix PS3 patterns powerpc/ps3: Fix CONFIG_PS3_FLASH=n build warning powerpc/32: Don't clobber personality flags on exec powerpc: Fix crash on CPU hotplug powerpc/85xx: Remove defconfigs that mpc85xx_{smp_}defconfig cover powerpc/85xx: Added SMP defconfig powerpc/85xx: Enabled a bunch of FSL specific drivers/options powerpc/85xx: Updated generic mpc85xx_defconfig powerpc: don't disable SATA interrupts on Freescale MPC8610 HPCD fsl_rio: Pass the proper device to dma mapping routines powerpc: Fix of_node_put() exit path in of_irq_map_one() powerpc/5200: defconfig updates powerpc/5200: Add FLASH nodes to lite5200 device tree powerpc/device-tree: Document MTD nodes with multiple "reg" tuples powerpc/of-device-tree: Factor MTD physmap bindings out of booting-without-of powerpc/5200: Bring the legacy fsl_spi_platform_data hooks back
2009-04-24block: simplify I/O stat accountingJerome Marchand
This simplifies I/O stat accounting switching code and separates it completely from I/O scheduler switch code. Requests are accounted according to the state of their request queue at the time of the request allocation. There is no need anymore to flush the request queue when switching I/O accounting state. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-24pktcdvd.h should include mempool.hAlexander Beregalov
Fix this build error: In file included from fs/compat_ioctl.c:104: include/linux/pktcdvd.h:285: error: expected specifier-qualifier-list before 'mempool_t' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-24Merge branch 'irq' into releaseLen Brown
2009-04-23USB: musb: Prevent multiple includes of musb.hMark A. Greer
Add #ifndef to musb header file to prevent multiple inclusions. Signed-off-by: Mark A. Greer <mgreer@mvista.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-23bitops: Add __ffs64 bitopSteven Whitehouse
Finds the first set bit in a 64 bit word. This is required in order to fix a bug in GFS2, but I think it should be a generic function in case of future users. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reviewed-by: Christoph Lameter <cl@linux.com> Reviewed-by: Willy Tarreau <w@1wt.eu>
2009-04-22PCI: only save/restore existent registers in the PCIe capabilityYu Zhao
PCIe 1.1 base neither requires the endpoint to implement the entire PCIe capability structure nor specifies default values of registers that are not implemented by the device. So we only save and restore registers that must be implemented by different device types if the device PCIe capability version is 1. PCIe 1.1 Capability Structure Expansion ECN and PCIe 2.0 requires all registers in the PCIe capability to be either implemented or hardwired to 0. Their PCIe capability version is 2. Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-22KVM: Fix overlapping check for memory slotsJan Kiszka
When checking for overlapping slots on registration of a new one, kvm currently also considers zero-length (ie. deleted) slots and rejects requests incorrectly. This finally denies user space from joining slots. Fix the check by skipping deleted slots and advertise this via a KVM_CAP_JOIN_MEMORY_REGIONS_WORKS. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2009-04-22block: include empty disks in /proc/diskstatsTejun Heo
/proc/diskstats used to show stats for all disks whether they're zero-sized or not and their non-zero partitions. Commit 074a7aca7afa6f230104e8e65eba3420263714a5 accidentally changed the behavior such that it doesn't print out zero sized disks. This patch implements DISK_PITER_INCL_EMPTY_PART0 flag to partition iterator and uses it in diskstats_show() such that empty part0 is shown in /proc/diskstats. Reported and bisectd by Dianel Collins. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Daniel Collins <solemnwarning@solemnwarning.no-ip.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-22bio: fix bio_kmalloc()Tejun Heo
Impact: fix bio_kmalloc() and its destruction path bio_kmalloc() was broken in two ways. * bvec_alloc_bs() first allocates bvec using kmalloc() and then ignores it and allocates again like non-kmalloc bvecs. * bio_kmalloc_destructor() didn't check for and free bio integrity data. This patch fixes the above problems. kmalloc patch is separated out from bio_alloc_bioset() and allocates the requested number of bvecs as inline bvecs. * bio_alloc_bioset() no longer takes NULL @bs. None other than bio_kmalloc() used it and outside users can't know how it was allocated anyway. * Define and use BIO_POOL_NONE so that pool index check in bvec_free_bs() triggers if inline or kmalloc allocated bvec gets there. * Relocate destructors on top of each allocation function so that how they're used is more clear. Jens Axboe suggested allocating bvecs inline. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-22Merge branch 'merge' of git://git.secretlab.ca/git/linux-2.6 into mergePaul Mackerras
2009-04-21driver synchronization: make scsi_wait_scan more advancedArjan van de Ven
There is currently only one way for userspace to say "wait for my storage device to get ready for the modules I just loaded": to load the scsi_wait_scan module. Expectations of userspace are that once this module is loaded, all the (storage) devices for which the drivers were loaded before the module load are present. Now, there are some issues with the implementation, and the async stuff got caught in the middle of this: The existing code only waits for the scsy async probing to finish, but it did not take into account at all that probing might not have begun yet. (Russell ran into this problem on his computer and the fix works for him) This patch fixes this more thoroughly than the previous "fix", which had some bad side effects (namely, for kernel code that wanted to wait for the scsi scan it would also do an async sync, which would deadlock if you did it from async context already.. there's a report about that on lkml): The patch makes the module first wait for all device driver probes, and then it will wait for the scsi parallel scan to finish. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>