aboutsummaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2009-01-06kobject: return the result of uevent sending by netlinkMing Lei
We need to return the result of uevent sending by netlink to caller, when uevent_helper is disabled and CONFIG_NET is defined. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-06uevent: don't pass envp_ext[] as format string in kobject_uevent_env()Tejun Heo
kobject_uevent_env() uses envp_ext[] as verbatim format string which can cause problems ranging from unexpectedly mangled string to oops if a string in envp_ext[] contains substring which can be interpreted as format. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-06driver core: Remove completion from struct klist_nodeMatthew Wilcox
Removing the completion from klist_node reduces its size from 64 bytes to 28 on x86-64. To maintain the semantics of klist_remove(), we add a single list of klist nodes which are pending deletion and scan them. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-06trivial: radix-tree: document wrap-around issue of radix_tree_next_hole()Wu Fengguang
And some 80-line cleanups. Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-01-06Merge branches 'core/futexes', 'core/locking', 'core/rcu' and 'linus' into ↵Ingo Molnar
core/urgent
2009-01-05Merge branch 'core/iommu' into core/urgentIngo Molnar
Conflicts: lib/swiotlb.c
2009-01-04swiotlb: Don't include linux/swiotlb.h twice in lib/swiotlb.cJesper Juhl
There's no point in including the linux/swiotlb.h header twice in lib/swiotlb.c - this patch gets rid of the unneeded include. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-04Merge branch 'linus' into core/urgentIngo Molnar
2009-01-03Merge branch 'cpus4096-for-linus-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits) x86: setup_per_cpu_areas() cleanup cpumask: fix compile error when CONFIG_NR_CPUS is not defined cpumask: use alloc_cpumask_var_node where appropriate cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t x86: use cpumask_var_t in acpi/boot.c x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids sched: put back some stack hog changes that were undone in kernel/sched.c x86: enable cpus display of kernel_max and offlined cpus ia64: cpumask fix for is_affinity_mask_valid() cpumask: convert RCU implementations, fix xtensa: define __fls mn10300: define __fls m32r: define __fls h8300: define __fls frv: define __fls cris: define __fls cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS cpumask: zero extra bits in alloc_cpumask_var_node cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/ cpumask: convert mm/ ...
2009-01-03Make %p print '(null)' for NULL pointersLinus Torvalds
Before, when we only ever printed out the pointer value itself, a NULL pointer would never cause issues and might as well be printed out as just its numeric value. However, with the extended %p formats, especially %pR, we might validly want to print out resources for debugging. And sometimes they don't even exist, and the resource pointer is just NULL. Print it out as such, rather than oopsing. This is a more generic version of a patch done by Trent Piepho (catching all %p cases rather than just %pR, and using "(null)" instead of "[NULL]" to match glibc). Requested-by: Trent Piepho <xyzzy@speakeasy.org> Acked-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-02swiotlb: add missing __init annotationsRoland Dreier
Impact: cleanup, reduce kernel size a bit The current kernel build warns: WARNING: vmlinux.o(.text+0x11458): Section mismatch in reference from the function swiotlb_alloc_boot() to the function .init.text:__alloc_bootmem_low() The function swiotlb_alloc_boot() references the function __init __alloc_bootmem_low(). This is often because swiotlb_alloc_boot lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong. WARNING: vmlinux.o(.text+0x1011f2): Section mismatch in reference from the function swiotlb_late_init_with_default_size() to the function .init.text:__alloc_bootmem_low() The function swiotlb_late_init_with_default_size() references the function __init __alloc_bootmem_low(). This is often because swiotlb_late_init_with_default_size lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong. and indeed the functions calling __alloc_bootmem_low() can be marked __init as well. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02Merge branch 'cpus4096-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits) x86: export vector_used_by_percpu_irq x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and() sched: nominate preferred wakeup cpu, fix x86: fix lguest used_vectors breakage, -v2 x86: fix warning in arch/x86/kernel/io_apic.c sched: fix warning in kernel/sched.c sched: move test_sd_parent() to an SMP section of sched.h sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0 sched: activate active load balancing in new idle cpus sched: bias task wakeups to preferred semi-idle packages sched: nominate preferred wakeup cpu sched: favour lower logical cpu number for sched_mc balance sched: framework for sched_mc/smt_power_savings=N sched: convert BALANCE_FOR_xx_POWER to inline functions x86: use possible_cpus=NUM to extend the possible cpus allowed x86: fix cpu_mask_to_apicid_and to include cpu_online_mask x86: update io_apic.c to the new cpumask code x86: Introduce topology_core_cpumask()/topology_thread_cpumask() x86: xen: use smp_call_function_many() x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c ... Fixed up trivial conflict in kernel/time/tick-sched.c manually
2009-01-01cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONSRusty Russell
Impact: new debug CONFIG options This helps find unconverted code. It currently breaks compile horribly, but we never wanted a flag day so that's expected. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-01-01cpumask: zero extra bits in alloc_cpumask_var_nodeRusty Russell
Impact: extra safety checks during transition When CONFIG_CPUMASKS_OFFSTACK is set, the new cpumask_ operators only use bits up to nr_cpu_ids, not NR_CPUS. Using the old cpus_ operators on these masks can mean accessing undefined bits. After some discussion, Mike and I decided to err on the side of caution; we zero the "undefined" bits in alloc_cpumask_var_node() until all the old cpumask functions are removed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-01-01bitmap: find_last_bit()Rusty Russell
Impact: New API As the name suggests. For the moment everyone uses the generic one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-01-01cpumask: fix bogus kernel-docLi Zefan
Impact: fix kernel-doc alloc_bootmem_cpumask_var() returns avoid. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-12-31Merge branch 'master' of ↵Rusty Russell
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: arch/x86/kernel/io_apic.c
2008-12-30Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: slub: avoid leaking caches or refcounts on sysfs error slab: Fix comment on #endif slab: remove GFP_THISNODE clearing from alloc_slabmgmt() slub: Add might_sleep_if() to slab_alloc() SLUB: failslab support slub: Fix incorrect use of loose slab: Update the kmem_cache_create documentation regarding the name parameter slub: make early_kmem_cache_node_alloc void slab: unsigned slabp->inuse cannot be less than 0 slub - fix get_object_page comment SLUB: Replace __builtin_return_address(0) with _RET_IP_. SLUB: cleanup - define macros instead of hardcoded numbers
2008-12-30Merge branch 'core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits) stacktrace: provide save_stack_trace_tsk() weak alias rcu: provide RCU options on non-preempt architectures too printk: fix discarding message when recursion_bug futex: clean up futex_(un)lock_pi fault handling "Tree RCU": scalable classic RCU implementation futex: rename field in futex_q to clarify single waiter semantics x86/swiotlb: add default swiotlb_arch_range_needs_mapping x86/swiotlb: add default phys<->bus conversion x86: unify pci iommu setup and allow swiotlb to compile for 32 bit x86: add swiotlb allocation functions swiotlb: consolidate swiotlb info message printing swiotlb: support bouncing of HighMem pages swiotlb: factor out copy to/from device swiotlb: add arch hook to force mapping swiotlb: allow architectures to override phys<->bus<->phys conversions swiotlb: add comment where we handle the overflow of a dma mask on 32 bit rcu: fix rcutorture behavior during reboot resources: skip sanity check of busy resources swiotlb: move some definitions to header swiotlb: allow architectures to override swiotlb pool allocation ... Fix up trivial conflicts in arch/x86/kernel/Makefile arch/x86/mm/init_32.c include/linux/hardirq.h as per Ingo's suggestions.
2008-12-30Merge branch 'master' of ↵Rusty Russell
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
2008-12-29locking, percpu counters: introduce separate lock classesPeter Zijlstra
Impact: fix lockdep false positives Classify percpu_counter instances similar to regular lock objects -- that is, per instantiation site. The networking code has increased its use of percpu_counters, which leads to false positives if they are treated as a single class. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29SLUB: failslab supportAkinobu Mita
Currently fault-injection capability for SLAB allocator is only available to SLAB. This patch makes it available to SLUB, too. [penberg@cs.helsinki.fi: unify slab and slub implementations] Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-12-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).
2008-12-28Merge branch 'x86-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (246 commits) x86: traps.c replace #if CONFIG_X86_32 with #ifdef CONFIG_X86_32 x86: PAT: fix address types in track_pfn_vma_new() x86: prioritize the FPU traps for the error code x86: PAT: pfnmap documentation update changes x86: PAT: move track untrack pfnmap stubs to asm-generic x86: PAT: remove follow_pfnmap_pte in favor of follow_phys x86: PAT: modify follow_phys to return phys_addr prot and return value x86: PAT: clarify is_linear_pfn_mapping() interface x86: ia32_signal: remove unnecessary declaration x86: common.c boot_cpu_stack and boot_exception_stacks should be static x86: fix intel x86_64 llc_shared_map/cpu_llc_id anomolies x86: fix warning in arch/x86/kernel/microcode_amd.c x86: ia32.h: remove unused struct sigfram32 and rt_sigframe32 x86: asm-offset_64: use rt_sigframe_ia32 x86: sigframe.h: include headers for dependency x86: traps.c declare functions before they get used x86: PAT: update documentation to cover pgprot and remap_pfn related changes - v3 x86: PAT: add pgprot_writecombine() interface for drivers - v3 x86: PAT: change pgprot_noncached to uc_minus instead of strong uc - v3 x86: PAT: implement track/untrack of pfnmap regions for x86 - v3 ...
2008-12-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (105 commits) SELinux: don't check permissions for kernel mounts security: pass mount flags to security_sb_kern_mount() SELinux: correctly detect proc filesystems of the form "proc/foo" Audit: Log TIOCSTI user namespaces: document CFS behavior user namespaces: require cap_set{ug}id for CLONE_NEWUSER user namespaces: let user_ns be cloned with fairsched CRED: fix sparse warnings User namespaces: use the current_user_ns() macro User namespaces: set of cleanups (v2) nfsctl: add headers for credentials coda: fix creds reference capabilities: define get_vfs_caps_from_disk when file caps are not enabled CRED: Allow kernel services to override LSM settings for task actions CRED: Add a kernel_service object class to SELinux CRED: Differentiate objective and effective subjective credentials on a task CRED: Documentation CRED: Use creds in file structs CRED: Prettify commoncap.c CRED: Make execve() take advantage of copy-on-write credentials ...
2008-12-28swiotlb: clean up EXPORT_SYMBOL usageFUJITA Tomonori
Impact: cleanup swiotlb uses EXPORT_SYMBOL in an inconsistent way. Some functions use EXPORT_SYMBOL at the end of functions. Some use it at the end of swiotlb.c. This cleans up swiotlb to use EXPORT_SYMBOL in a consistent way (at the end of functions). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28swiotlb: remove unnecessary declarationFUJITA Tomonori
Impact: cleanup Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28swiotlb: add support for systems with highmemBecky Bruce
Impact: extend code for highmem - existing users unaffected On highmem systems, the original dma buffer might not have a virtual mapping - we need to kmap it in to perform the bounce. Extract the code that does the actual copy into a function that does the kmap if highmem is enabled, and default to the normal swiotlb memcpy if not. [ ported by Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> ] Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28swiotlb: store phys address in io_tlb_orig_addr arrayBecky Bruce
Impact: refactor code, cleanup When we enable swiotlb for platforms that support HIGHMEM, we can no longer store the virtual address of the original dma buffer, because that buffer might not have a permament mapping. Change the swiotlb code to instead store the physical address of the original buffer. Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28swiotlb: add hwdev to swiotlb_phys_to_bus() / swiotlb_sg_to_bus()Jeremy Fitzhardinge
Impact: extend functions with a (yet unused) parameter, update callsites Some architectures need it - in preparation for highmem swiotlb. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-25Merge branches 'core/debugobjects', 'core/iommu', 'core/locking', ↵Ingo Molnar
'core/printk', 'core/rcu', 'core/resources', 'core/softirq' and 'core/stacktrace' into core/core
2008-12-25Merge commit 'v2.6.28' into core/coreIngo Molnar
2008-12-25Merge branch 'next' into for-linusJames Morris
2008-12-25libcrc32c: Select CRYPTO in KconfigHerbert Xu
Selecting CRYPTO_CRC32C is not enough as CRYPTO which CRYPTO_CRC32C depends on may be disabled. This patch adds the select on CRYPTO. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-12-25libcrc32c: Fix "crc32c undefined" compilation errorAdrian-Ken Rueegsegger
The latest shash changes leave crc32c undefined: [...] Building modules, stage 2. MODPOST 1381 modules ERROR: "crc32c" [net/sctp/sctp.ko] undefined! ERROR: "crc32c" [net/ipv4/netfilter/nf_nat_proto_sctp.ko] undefined! Adding EXPORT_SYMBOL(crc32c) to lib/libcrc32c.c fixes the compile error. This patch has been compile-tested only. Signed-off-by: Adrian-Ken Rueegsegger <rueegsegger@swiss-it.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-12-25libcrc32c: Move implementation to crypto crc32cHerbert Xu
This patch swaps the role of libcrc32c and crc32c. Previously the implementation was in libcrc32c and crc32c was a wrapper. Now the code is in crc32c and libcrc32c just calls the crypto layer. The reason for the change is to tap into the algorithm selection capability of the crypto API so that optimised implementations such as the one utilising Intel's CRC32C instruction can be used where available. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-12-23Merge branches 'x86/apic', 'x86/cleanups', 'x86/cpufeature', ↵Ingo Molnar
'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core
2008-12-19cpumask: documentation for cpumask_var_tMike Travis
Impact: New kerneldoc comments Additional documentation added to all the alloc_cpumask and free_cpumask functions. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor additions)
2008-12-19cpumask: Add alloc_cpumask_var_node()Mike Travis
Impact: New API This will be needed in x86 code to allocate the domain and old_domain cpumasks on the same node as where the containing irq_cfg struct is allocated. (Also fixes double-dump_stack on rare CONFIG_DEBUG_PER_CPU_MAPS case) Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (re-impl alloc_cpumask_var)
2008-12-18"Tree RCU": scalable classic RCU implementationPaul E. McKenney
This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-18Merge branch 'linus' into cpus4096Ingo Molnar
2008-12-17driver core: add newlines to debugging enabled/disabled messagesMarcel Holtmann
Both messages are missing the newline and thus dmesg output gets scrambled. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-12-17driver core: fix using 'ret' variable in unregister_dynamic_debug_moduleJohann Felix Soden
The 'ret' variable is assigned, but not used in the return statement. Fix this. Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-12-17swiotlb: consolidate swiotlb info message printingIan Campbell
Impact: clean up swiotlb printks Remove duplicated swiotlb info printing, and make it more detailed. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-17swiotlb: support bouncing of HighMem pagesJeremy Fitzhardinge
Impact: prepare the swiotlb code for HighMem struct pages This requires us to treat DMA regions in terms of page+offset rather than virtual addressing since a HighMem page may not have a mapping. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-17swiotlb: factor out copy to/from deviceJeremy Fitzhardinge
Impact: generalize IO bounce memcpys Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-17swiotlb: add arch hook to force mappingIan Campbell
Impact: generalize the sw-IOTLB range checks Some architectures require special rules to determine whether a range needs mapping or not. This adds a weak function for architectures to override. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-17swiotlb: allow architectures to override phys<->bus<->phys conversionsIan Campbell
Impact: generalize phys<->bus<->phys conversions in the swiotlb code Architectures may need to override these conversions. Implement a __weak hook point containing the default implementation. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-17swiotlb: add comment where we handle the overflow of a dma mask on 32 bitIan Campbell
Impact: cleanup Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-16swiotlb: move some definitions to headerIan Campbell
Impact: cleanup Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>