aboutsummaryrefslogtreecommitdiff
path: root/include/linux/memory_hotplug.h
AgeCommit message (Collapse)Author
2008-04-28memory hotplug: register section/node id to freeYasunori Goto
This patch set is to free pages which is allocated by bootmem for memory-hotremove. Some structures of memory management are allocated by bootmem. ex) memmap, etc. To remove memory physically, some of them must be freed according to circumstance. This patch set makes basis to free those pages, and free memmaps. Basic my idea is using remain members of struct page to remember information of users of bootmem (section number or node id). When the section is removing, kernel can confirm it. By this information, some issues can be solved. 1) When the memmap of removing section is allocated on other section by bootmem, it should/can be free. 2) When the memmap of removing section is allocated on the same section, it shouldn't be freed. Because the section has to be logical memory offlined already and all pages must be isolated against page allocater. If it is freed, page allocator may use it which will be removed physically soon. 3) When removing section has other section's memmap, kernel will be able to show easily which section should be removed before it for user. (Not implemented yet) 4) When the above case 2), the page isolation will be able to check and skip memmap's page when logical memory offline (offline_pages()). Current page isolation code fails in this case because this page is just reserved page and it can't distinguish this pages can be removed or not. But, it will be able to do by this patch. (Not implemented yet.) 5) The node information like pgdat has similar issues. But, this will be able to be solved too by this. (Not implemented yet, but, remembering node id in the pages.) Fortunately, current bootmem allocator just keeps PageReserved flags, and doesn't use any other members of page struct. The users of bootmem doesn't use them too. This patch: This is to register information which is node or section's id. Kernel can distinguish which node/section uses the pages allcated by bootmem. This is basis for hot-remove sections or nodes. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28hotplug memory remove: generic __remove_pages() supportBadari Pulavarty
Generic helper function to remove section mappings and sysfs entries for the section of the memory we are removing. offline_pages() correctly adjusted zone and marked the pages reserved. TODO: Yasunori Goto is working on patches to free up allocations from bootmem. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16fix memory hot remove not configured case.KAMEZAWA Hiroyuki
Now, arch dependent code around CONFIG_MEMORY_HOTREMOVE is a mess. This patch cleans up them. This is against 2.6.23-rc6-mm1. - fix compile failure on ia64/ CONFIG_MEMORY_HOTPLUG && !CONFIG_MEMORY_HOTREMOVE case. - For !CONFIG_MEMORY_HOTREMOVE, add generic no-op remove_memory(), which returns -EINVAL. - removed remove_pages() only used in powerpc. - removed no-op remove_memory() in i386, sh, sparc64, x86_64. - only powerpc returns -ENOSYS at memory hot remove(no-op). changes it to return -EINVAL. Note: Currently, only ia64 supports CONFIG_MEMORY_HOTREMOVE. I welcome other archs if there are requirements and testers. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16memory unplug: page offlineKAMEZAWA Hiroyuki
Logic. - set all pages in [start,end) as isolated migration-type. by this, all free pages in the range will be not-for-use. - Migrate all LRU pages in the range. - Test all pages in the range's refcnt is zero or not. Todo: - allocate migration destination page from better area. - confirm page_count(page)== 0 && PageReserved(page) page is safe to be freed.. (I don't like this kind of page but.. - Find out pages which cannot be migrated. - more running tests. - Use reclaim for unplugging other memory type area. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16memory unplug: memory hotplug cleanupKAMEZAWA Hiroyuki
A clean up patch for "scanning memory resource [start, end)" operation. Now, find_next_system_ram() function is used in memory hotplug, but this interface is not easy to use and codes are complicated. This patch adds walk_memory_resouce(start,len,arg,func) function. The function 'func' is called per valid memory resouce range in [start,pfn). [pbadari@us.ibm.com: Error handling in walk_memory_resource()] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16Clean up duplicate includes in include/linux/memory_hotplug.hJesper Juhl
This patch cleans up duplicate includes in include/linux/memory_hotplug.h Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2006-10-01[PATCH] hot-add-mem x86_64: fixup externsKeith Mannthey
Fix up externs in memory_hotplug.c. Cleanup. Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pgdat allocation and update for ia64 of memory hotplug: allocate ↵Yasunori Goto
pgdat and per node data This is a patch to allocate pgdat and per node data area for ia64. The size for them can be calculated by compute_pernodesize(). Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pgdat allocation and update for ia64 of memory hotplug: update pgdat ↵Yasunori Goto
address array This is to refresh node_data[] array for ia64. As I mentioned previous patches, ia64 has copies of information of pgdat address array on each node as per node data. At v2 of node_add, this function used stop_machine_run() to update them. (I wished that they were copied safety as much as possible.) But, in this patch, this arrays are just copied simply, and set node_online_map bit after completion of pgdat initialization. So, kernel must touch NODE_DATA() macro after checking node_online_map(). (Current code has already done it.) This is more simple way for just hot-add..... Note : It will be problem when hot-remove will occur, because, even if online_map bit is set, kernel may touch NODE_DATA() due to race condition. :-( Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pgdat allocation for new node add (refresh node_data[])Yasunori Goto
Refresh NODE_DATA() for generic archs. In this case, NODE_DATA(nid) == node_data[nid]. node_data[] is array of address of pgdat. So, refresh is quite simple. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pgdat allocation for new node add (generic alloc node_data)Yasunori Goto
For node hotplug, basically we have to allocate new pgdat. But, there are several types of implementations of pgdat. 1. Allocate only pgdat. This style allocate only pgdat area. And its address is recorded in node_data[]. It is most popular style. 2. Static array of pgdat In this case, all of pgdats are static array. Some archs use this style. 3. Allocate not only pgdat, but also per node data. To increase performance, each node has copy of some data as a per node data. So, this area must be allocated too. Ia64 is this style. Ia64 has the copies of node_data[] array on each per node data to increase performance. In this series of patches, treat (1) as generic arch. generic archs can use generic function. (2) and (3) should have its own if necessary. This patch defines pgdat allocator. Updating NODE_DATA() macro function is in other patch. Signed-off-by: Yasonori Goto <y-goto@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27[PATCH] pgdat allocation for new node add (specify node id)Yasunori Goto
Change the name of old add_memory() to arch_add_memory. And use node id to get pgdat for the node at NODE_DATA(). Note: Powerpc's old add_memory() is defined as __devinit. However, add_memory() is usually called only after bootup. I suppose it may be redundant. But, I'm not well known about powerpc. So, I keep it. (But, __meminit is better at least.) Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-20[PATCH] memory_hotplug.h cleanupAdrian Bunk
We don't have to #if guard prototypes. This also fixes a bug observed by Randy Dunlap due to a misspelled option in the #if. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09[PATCH] x86_64: Support memory hotadd without sparsememAndi Kleen
Memory hotadd doesn't need SPARSEMEM, but can be handled by just preallocating mem_maps. This only needs some untangling of ifdefs to enable the necessary code even without SPARSEMEM. Originally from Keith Mannthey, hacked by AK. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-06[PATCH] memory-hotplug compile fixKAMEZAWA Hiroyuki
include/linux/memory_hotplug.h:53: warning: 'struct page' declared inside parameter list (akpm: I tossed in a couple more possibly-needed-sometime struct decls too) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29[PATCH] memory hotplug: sysfs and add/remove functionsDave Hansen
This adds generic memory add/remove and supporting functions for memory hotplug into a new file as well as a memory hotplug kernel config option. Individual architecture patches will follow. For now, disable memory hotplug when swsusp is enabled. There's a lot of churn there right now. We'll fix it up properly once it calms down. Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29[PATCH] memory hotplug locking: zone span seqlockDave Hansen
See the "fixup bad_range()" patch for more information, but this actually creates a the lock to protect things making assumptions about a zone's size staying constant at runtime. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29[PATCH] memory hotplug locking: node_size_lockDave Hansen
pgdat->node_size_lock is basically only neeeded in one place in the normal code: show_mem(), which is the arch-specific sysrq-m printing function. Strictly speaking, the architectures not doing memory hotplug do no need this locking in show_mem(). However, they are all included for completeness. This should also make any future consolidation of all of the implementations a little more straightforward. This lock is also held in the sparsemem code during a memory removal, as sections are invalidated. This is the place there pfn_valid() is made false for a memory area that's being removed. The lock is only required when doing pfn_valid() operations on memory which the user does not already have a reference on the page, such as in show_mem(). Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>