From 98a79d6a50181ca1ecf7400eda01d5dc1bc0dbf0 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Sat, 13 Dec 2008 21:19:41 +1030 Subject: cpumask: centralize cpu_online_map and cpu_possible_map Impact: cleanup Each SMP arch defines these themselves. Move them to a central location. Twists: 1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a CONFIG_INIT_ALL_POSSIBLE for this rather than break them. 2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'. Those archs simply have phys_cpu_present_map replaced everywhere. 3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky so I just manipulate them both in sync. 4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map' declarations. Signed-off-by: Rusty Russell Reviewed-by: Grant Grundler Tested-by: Tony Luck Acked-by: Ingo Molnar Cc: Mike Travis Cc: ink@jurassic.park.msu.ru Cc: rmk@arm.linux.org.uk Cc: starvik@axis.com Cc: tony.luck@intel.com Cc: takata@linux-m32r.org Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: paulus@samba.org Cc: schwidefsky@de.ibm.com Cc: lethal@linux-sh.org Cc: wli@holomorphy.com Cc: davem@davemloft.net Cc: jdike@addtoit.com Cc: mingo@redhat.com --- init/Kconfig | 9 +++++++++ 1 file changed, 9 insertions(+) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index f763762d544..7656623f500 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -916,6 +916,15 @@ config KMOD endif # MODULES +config INIT_ALL_POSSIBLE + bool + help + Back when each arch used to define their own cpu_online_map and + cpu_possible_map, some of them chose to initialize cpu_possible_map + with all 1s, and others with all 0s. When they were centralised, + it was better to provide this option than to break all the archs + and have several arch maintainers persuing me down dark alleys. + config STOP_MACHINE bool default y -- cgit v1.2.3 From 64db4cfff99c04cd5f550357edcc8780f96b54a2 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 18 Dec 2008 21:55:32 +0100 Subject: "Tree RCU": scalable classic RCU implementation This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: Paul E. McKenney Signed-off-by: Ingo Molnar --- init/Kconfig | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index f763762d544..9dd7958a71f 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -928,10 +928,16 @@ source "block/Kconfig" config PREEMPT_NOTIFIERS bool -config CLASSIC_RCU - def_bool !PREEMPT_RCU +config TREE_RCU_TRACE + def_bool RCU_TRACE && TREE_RCU + select DEBUG_FS help - This option selects the classic RCU implementation that is - designed for best read-side performance on non-realtime - systems. Classic RCU is the default. Note that the - PREEMPT_RCU symbol is used to select/deselect this option. + This option provides tracing for the TREE_RCU implementation, + permitting Makefile to trivially select kernel/rcutree_trace.c. + +config PREEMPT_RCU_TRACE + def_bool RCU_TRACE && PREEMPT_RCU + select DEBUG_FS + help + This option provides tracing for the PREEMPT_RCU implementation, + permitting Makefile to trivially select kernel/rcupreempt_trace.c. -- cgit v1.2.3 From 12d79bafb75639f406a9f71aab94808c414c836e Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Thu, 25 Dec 2008 09:31:28 +0100 Subject: rcu: provide RCU options on non-preempt architectures too Impact: build fix Some old architectures still do not use kernel/Kconfig.preempt, so the moving of the RCU options there broke their build: In file included from /home/mingo/tip/include/linux/sem.h:81, from /home/mingo/tip/include/linux/sched.h:69, from /home/mingo/tip/arch/alpha/kernel/asm-offsets.c:9: /home/mingo/tip/include/linux/rcupdate.h:62:2: error: #error "Unknown RCU implementation specified to kernel configuration" Move these options back to init/Kconfig, which every architecture includes. Signed-off-by: Ingo Molnar --- init/Kconfig | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index 9dd7958a71f..6b0fdedf359 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -928,6 +928,80 @@ source "block/Kconfig" config PREEMPT_NOTIFIERS bool +choice + prompt "RCU Implementation" + default CLASSIC_RCU + +config CLASSIC_RCU + bool "Classic RCU" + help + This option selects the classic RCU implementation that is + designed for best read-side performance on non-realtime + systems. + + Select this option if you are unsure. + +config TREE_RCU + bool "Tree-based hierarchical RCU" + help + This option selects the RCU implementation that is + designed for very large SMP system with hundreds or + thousands of CPUs. + +config PREEMPT_RCU + bool "Preemptible RCU" + depends on PREEMPT + help + This option reduces the latency of the kernel by making certain + RCU sections preemptible. Normally RCU code is non-preemptible, if + this option is selected then read-only RCU sections become + preemptible. This helps latency, but may expose bugs due to + now-naive assumptions about each RCU read-side critical section + remaining on a given CPU through its execution. + +endchoice + +config RCU_TRACE + bool "Enable tracing for RCU" + depends on TREE_RCU || PREEMPT_RCU + help + This option provides tracing in RCU which presents stats + in debugfs for debugging RCU implementation. + + Say Y here if you want to enable RCU tracing + Say N if you are unsure. + +config RCU_FANOUT + int "Tree-based hierarchical RCU fanout value" + range 2 64 if 64BIT + range 2 32 if !64BIT + depends on TREE_RCU + default 64 if 64BIT + default 32 if !64BIT + help + This option controls the fanout of hierarchical implementations + of RCU, allowing RCU to work efficiently on machines with + large numbers of CPUs. This value must be at least the cube + root of NR_CPUS, which allows NR_CPUS up to 32,768 for 32-bit + systems and up to 262,144 for 64-bit systems. + + Select a specific number if testing RCU itself. + Take the default if unsure. + +config RCU_FANOUT_EXACT + bool "Disable tree-based hierarchical RCU auto-balancing" + depends on TREE_RCU + default n + help + This option forces use of the exact RCU_FANOUT value specified, + regardless of imbalances in the hierarchy. This is useful for + testing RCU itself, and might one day be useful on systems with + strong NUMA behavior. + + Without RCU_FANOUT_EXACT, the code will balance the hierarchy. + + Say N if unsure. + config TREE_RCU_TRACE def_bool RCU_TRACE && TREE_RCU select DEBUG_FS -- cgit v1.2.3 From a327ca2c2674c5a9a0073421df19bfc362698136 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Tue, 8 Jul 2008 19:00:26 +0200 Subject: remove CONFIG_KMOD Now that nothing depends on it any more, remove CONFIG_KMOD. Signed-off-by: Johannes Berg Signed-off-by: Rusty Russell --- init/Kconfig | 6 ------ 1 file changed, 6 deletions(-) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index f6281711166..52847eec739 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -916,12 +916,6 @@ config MODULE_SRCVERSION_ALL the version). With this option, such a "srcversion" field will be created for all modules. If unsure, say N. -config KMOD - def_bool y - help - This is being removed soon. These days, CONFIG_MODULES - implies CONFIG_KMOD, so use that instead. - endif # MODULES config INIT_ALL_POSSIBLE -- cgit v1.2.3 From fce3e804cfad49208fd2ff9db4bcb57481409e1d Mon Sep 17 00:00:00 2001 From: Kay Sievers Date: Sat, 1 Nov 2008 14:03:00 +0100 Subject: sysfs: clarify SYSFS_DEPRECATED help text This should make the help text of SYSFS_DEPRECATED more clear, that this is _not_ about (what some people think it is) suppressing a few symlinks and variables, but a different sysfs _layout_ with new features. Signed-off-by: Kay Sievers Signed-off-by: Greg Kroah-Hartman --- init/Kconfig | 44 +++++++++++++++++++++++++++----------------- 1 file changed, 27 insertions(+), 17 deletions(-) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index 52847eec739..d9d3dbabdb1 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -423,27 +423,37 @@ config SYSFS_DEPRECATED bool config SYSFS_DEPRECATED_V2 - bool "Create deprecated sysfs files" + bool "Create deprecated sysfs layout for older userspace tools" depends on SYSFS default y select SYSFS_DEPRECATED help - This option creates deprecated symlinks such as the - "device"-link, the :-link, and the - "bus"-link. It may also add deprecated key in the - uevent environment. - None of these features or values should be used today, as - they export driver core implementation details to userspace - or export properties which can't be kept stable across kernel - releases. - - If enabled, this option will also move any device structures - that belong to a class, back into the /sys/class hierarchy, in - order to support older versions of udev and some userspace - programs. - - If you are using a distro with the most recent userspace - packages, it should be safe to say N here. + This option switches the layout of sysfs to the deprecated + version. + + The current sysfs layout features a unified device tree at + /sys/devices/, which is able to express a hierarchy between + class devices. If the deprecated option is set to Y, the + unified device tree is split into a bus device tree at + /sys/devices/ and several individual class device trees at + /sys/class/. The class and bus devices will be connected by + ":" and the "device" links. The "block" + class devices, will not show up in /sys/class/block/. Some + subsystems will suppress the creation of some devices which + depend on the unified device tree. + + This option is not a pure compatibility option that can + be safely enabled on newer distributions. It will change the + layout of sysfs to the non-extensible deprecated version, + and disable some features, which can not be exported without + confusing older userspace tools. Since 2007/2008 all major + distributions do not enable this option, and ship no tools which + depend on the deprecated layout or this option. + + If you are using a new kernel on an older distribution, or use + older userspace tools, you might need to say Y here. Do not say Y, + if the original kernel, that came with your distribution, has + this option set to N. config PROC_PID_CPUSET bool "Include legacy /proc//cpuset file" -- cgit v1.2.3 From 853ac43ab194f5051b27a55060215d696dc9480d Mon Sep 17 00:00:00 2001 From: Matt Mackall Date: Tue, 6 Jan 2009 14:40:20 -0800 Subject: shmem: unify regular and tiny shmem tiny-shmem shares most of its 130 lines of code with shmem and tends to break when particular bits of shmem get modified. Unifying saves code and makes keeping these two in sync much easier. before: 14367 392 24 14783 39bf mm/shmem.o 396 72 8 476 1dc mm/tiny-shmem.o after: 14367 392 24 14783 39bf mm/shmem.o 412 72 8 492 1ec mm/shmem.o tiny Signed-off-by: Matt Mackall Acked-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/Kconfig | 4 ---- 1 file changed, 4 deletions(-) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index 52847eec739..315a6114bf8 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -838,10 +838,6 @@ config RT_MUTEXES boolean select PLIST -config TINY_SHMEM - default !SHMEM - bool - config BASE_SMALL int default 0 if BASE_FULL -- cgit v1.2.3 From 5cdc38f98596662620b822a4e13f797c3f2f65e0 Mon Sep 17 00:00:00 2001 From: KAMEZAWA Hiroyuki Date: Wed, 7 Jan 2009 18:07:30 -0800 Subject: cgroups: make cgroup config a submenu Making CGROUP related configs be a sub-menu. This patch make CGROUP related configs be a sub-menu and makes 1st level configs of "General Setup" shorter. including following additional changes - add help comment about CGROUPS and GROUP_SCHED. - moved MM_OWNER config to the bottom. (for good indent in menuconfig) Signed-off-by: KAMEZAWA Hiroyuki Reviewed-by: Daisuke Nishimura Cc: Balbir Singh Cc: Paul Menage Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/Kconfig | 123 ++++++++++++++++++++++++++++++++--------------------------- 1 file changed, 67 insertions(+), 56 deletions(-) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index e7893b1d3e4..6fcd192aa60 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -271,59 +271,6 @@ config LOG_BUF_SHIFT 13 => 8 KB 12 => 4 KB -config CGROUPS - bool "Control Group support" - help - This option will let you use process cgroup subsystems - such as Cpusets - - Say N if unsure. - -config CGROUP_DEBUG - bool "Example debug cgroup subsystem" - depends on CGROUPS - default n - help - This option enables a simple cgroup subsystem that - exports useful debugging information about the cgroups - framework - - Say N if unsure - -config CGROUP_NS - bool "Namespace cgroup subsystem" - depends on CGROUPS - help - Provides a simple namespace cgroup subsystem to - provide hierarchical naming of sets of namespaces, - for instance virtual servers and checkpoint/restart - jobs. - -config CGROUP_FREEZER - bool "control group freezer subsystem" - depends on CGROUPS - help - Provides a way to freeze and unfreeze all tasks in a - cgroup. - -config CGROUP_DEVICE - bool "Device controller for cgroups" - depends on CGROUPS && EXPERIMENTAL - help - Provides a cgroup implementing whitelists for devices which - a process in the cgroup can mknod or open. - -config CPUSETS - bool "Cpuset support" - depends on SMP && CGROUPS - help - This option will let you create and manage CPUSETs which - allow dynamically partitioning a system into sets of CPUs and - Memory Nodes and assigning tasks to run only within those sets. - This is primarily useful on large SMP or NUMA systems. - - Say N if unsure. - # # Architectures with an unreliable sched_clock() should select this: # @@ -337,6 +284,8 @@ config GROUP_SCHED help This feature lets CPU scheduler recognize task groups and control CPU bandwidth allocation to such task groups. + In order to create a group from arbitrary set of processes, use + CONFIG_CGROUPS. (See Control Group support.) config FAIR_GROUP_SCHED bool "Group scheduling for SCHED_OTHER" @@ -379,6 +328,66 @@ config CGROUP_SCHED endchoice +menu "Control Group support" +config CGROUPS + bool "Control Group support" + help + This option add support for grouping sets of processes together, for + use with process control subsystems such as Cpusets, CFS, memory + controls or device isolation. + See + - Documentation/cpusets.txt (Cpusets) + - Documentation/scheduler/sched-design-CFS.txt (CFS) + - Documentation/cgroups/ (features for grouping, isolation) + - Documentation/controllers/ (features for resource control) + + Say N if unsure. + +config CGROUP_DEBUG + bool "Example debug cgroup subsystem" + depends on CGROUPS + default n + help + This option enables a simple cgroup subsystem that + exports useful debugging information about the cgroups + framework + + Say N if unsure + +config CGROUP_NS + bool "Namespace cgroup subsystem" + depends on CGROUPS + help + Provides a simple namespace cgroup subsystem to + provide hierarchical naming of sets of namespaces, + for instance virtual servers and checkpoint/restart + jobs. + +config CGROUP_FREEZER + bool "control group freezer subsystem" + depends on CGROUPS + help + Provides a way to freeze and unfreeze all tasks in a + cgroup. + +config CGROUP_DEVICE + bool "Device controller for cgroups" + depends on CGROUPS && EXPERIMENTAL + help + Provides a cgroup implementing whitelists for devices which + a process in the cgroup can mknod or open. + +config CPUSETS + bool "Cpuset support" + depends on SMP && CGROUPS + help + This option will let you create and manage CPUSETs which + allow dynamically partitioning a system into sets of CPUs and + Memory Nodes and assigning tasks to run only within those sets. + This is primarily useful on large SMP or NUMA systems. + + Say N if unsure. + config CGROUP_CPUACCT bool "Simple CPU accounting cgroup subsystem" depends on CGROUPS @@ -393,9 +402,6 @@ config RESOURCE_COUNTERS infrastructure that works with cgroups depends on CGROUPS -config MM_OWNER - bool - config CGROUP_MEM_RES_CTLR bool "Memory Resource Controller for Control Groups" depends on CGROUPS && RESOURCE_COUNTERS @@ -419,6 +425,11 @@ config CGROUP_MEM_RES_CTLR This config option also selects MM_OWNER config option, which could in turn add some fork/exit overhead. +config MM_OWNER + bool + +endmenu + config SYSFS_DEPRECATED bool -- cgit v1.2.3 From c9d5409f8d46fd0d18b4a4481d9caa04076d87fc Mon Sep 17 00:00:00 2001 From: Li Zefan Date: Wed, 7 Jan 2009 18:07:35 -0800 Subject: memcg: fix a typo in Kconfig s/contoller/controller/ Signed-of-by: Li Zefan Cc: Paul Menage Acked-by: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: Pavel Emelyanov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index 6fcd192aa60..7cbe1f43ca2 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -420,7 +420,7 @@ config CGROUP_MEM_RES_CTLR sure you need the memory resource controller. Even when you enable this, you can set "cgroup_disable=memory" at your boot option to disable memory resource controller and you can avoid overheads. - (and lose benefits of memory resource contoller) + (and lose benefits of memory resource controller) This config option also selects MM_OWNER config option, which could in turn add some fork/exit overhead. -- cgit v1.2.3 From c077719be8e9e6b55702117513d1b5f41d80404a Mon Sep 17 00:00:00 2001 From: KAMEZAWA Hiroyuki Date: Wed, 7 Jan 2009 18:07:57 -0800 Subject: memcg: mem+swap controller Kconfig Config and control variable for mem+swap controller. This patch adds CONFIG_CGROUP_MEM_RES_CTLR_SWAP (memory resource controller swap extension.) For accounting swap, it's obvious that we have to use additional memory to remember "who uses swap". This adds more overhead. So, it's better to offer "choice" to users. This patch adds 2 choices. This patch adds 2 parameters to enable swap extension or not. - CONFIG - boot option Reviewed-by: Daisuke Nishimura Signed-off-by: KAMEZAWA Hiroyuki Cc: Li Zefan Cc: Balbir Singh Cc: Pavel Emelyanov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/Kconfig | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'init/Kconfig') diff --git a/init/Kconfig b/init/Kconfig index 7cbe1f43ca2..a724a149bf3 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -428,6 +428,23 @@ config CGROUP_MEM_RES_CTLR config MM_OWNER bool +config CGROUP_MEM_RES_CTLR_SWAP + bool "Memory Resource Controller Swap Extension(EXPERIMENTAL)" + depends on CGROUP_MEM_RES_CTLR && SWAP && EXPERIMENTAL + help + Add swap management feature to memory resource controller. When you + enable this, you can limit mem+swap usage per cgroup. In other words, + when you disable this, memory resource controller has no cares to + usage of swap...a process can exhaust all of the swap. This extension + is useful when you want to avoid exhaustion swap but this itself + adds more overheads and consumes memory for remembering information. + Especially if you use 32bit system or small memory system, please + be careful about enabling this. When memory resource controller + is disabled by boot option, this will be automatically disabled and + there will be no overhead from this. Even when you set this config=y, + if boot option "noswapaccount" is set, swap will not be accounted. + + endmenu config SYSFS_DEPRECATED -- cgit v1.2.3