[PATCH] slab: fix two issues in kmalloc_node / __cache_alloc_node

This addresses two issues:

1. kmalloc_node() may intermittently return NULL if we are allocating
   from the current node and are unable to obtain memory for the current
   node from the page allocator. This is because we call ____cache_alloc()
   if nodeid == numa_node_id(), and ____cache_alloc() is not able to fall
   back to other nodes.

   This was introduced in the 2.6.19 development cycle. Kernels <= 2.6.18
   do not do a restricted allocation in that case and blindly trust the
   page allocator to have given us memory from the indicated node. They
   insert the page, regardless of the node it came from, into the queues
   for the current node.

2. If kmalloc_node() is used on a node that has not been bootstrapped yet,
   then we may try to pass an invalid node number to
   ____cache_alloc_node(), triggering a BUG().

   Change the function to call fallback_alloc() instead. Only call
   fallback_alloc() if we are allowed to fall back at all. The need to
   handle a node that is not bootstrapped yet also first surfaced in the
   2.6.19 cycle.

Update the comments since they were still describing the old kmalloc_node
from 2.6.12.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
author: Christoph Lameter <clameter@sgi.com> 2006-12-06 20:33:24 -0800
committer: Linus Torvalds <torvalds@woody.osdl.org> 2006-12-07 08:39:25 -0800
commit: 5bcd234d881d83ac0259c6d42d98f134e31c60a8 (patch)
tree: 40d58218ce224200336c449ba035bcb6ec119d89
parent: 1b1cec4bbc59feac89670d5d6d222a02545bac94 (diff)
1 files changed, 28 insertions, 12 deletions
diff --git a/mm/slab.c b/mm/slab.c
index bb831ba63e1..6da554fd3f6 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3459,29 +3459,45 @@ out:
  * @flags: See kmalloc().
  * @nodeid: node number of the target node.
  *
- * Identical to kmem_cache_alloc, except that this function is slow
- * and can sleep. And it will allocate memory on the given node, which
- * can improve the performance for cpu bound structures.
- * New and improved: it will now make sure that the object gets
- * put on the correct node list so that there is no false sharing.
+ * Identical to kmem_cache_alloc but it will allocate memory on the given
+ * node, which can improve the performance for cpu bound structures.
+ *
+ * Fallback to other node is possible if __GFP_THISNODE is not set.
  */
 static __always_inline void *
 __cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
 		int nodeid, void *caller)
 {
 	unsigned long save_flags;
-	void *ptr;
+	void *ptr = NULL;
 
 	cache_alloc_debugcheck_before(cachep, flags);
 	local_irq_save(save_flags);
 
-	if (nodeid == -1 || nodeid == numa_node_id() ||
-			!cachep->nodelists[nodeid])
-		ptr = ____cache_alloc(cachep, flags);
-	else
-		ptr = ____cache_alloc_node(cachep, flags, nodeid);
-	local_irq_restore(save_flags);
+	if (unlikely(nodeid == -1))
+		nodeid = numa_node_id();
 
+	if (likely(cachep->nodelists[nodeid])) {
+		if (nodeid == numa_node_id()) {
+			/*
+			 * Use the locally cached objects if possible.
+			 * However ____cache_alloc does not allow fallback
+			 * to other nodes. It may fail while we still have
+			 * objects on other nodes available.
+			 */
+			ptr = ____cache_alloc(cachep, flags);
+		}
+		if (!ptr) {
+			/* ____cache_alloc_node can fall back to other nodes */
+			ptr = ____cache_alloc_node(cachep, flags, nodeid);
+		}
+	} else {
+		/* Node not bootstrapped yet */
+		if (!(flags & __GFP_THISNODE))
+			ptr = fallback_alloc(cachep, flags);
+	}
+
+	local_irq_restore(save_flags);
 	ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
 
 	return ptr;
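
The dispatch logic changed by this patch can be sketched as a small userspace model. This is only an illustration, not kernel code: every `model_*` name, `MODEL_THISNODE` and `MODEL_NO_OBJECT` is hypothetical. Objects are reduced to per-node counters, and each function returns the node that satisfied the request, or `MODEL_NO_OBJECT` in place of NULL. The model assumes that `____cache_alloc_node()` falls back to other nodes unless `__GFP_THISNODE` is set, as the comment added by the patch suggests.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 4
#define MODEL_THISNODE  0x1u	/* hypothetical stand-in for __GFP_THISNODE */
#define MODEL_NO_OBJECT (-1)	/* stands in for a NULL return */

/* Toy cache: per-node free-object counts plus a "bootstrapped" flag. */
struct model_cache {
	int  free_objs[MODEL_MAX_NODES];
	bool bootstrapped[MODEL_MAX_NODES];	/* models cachep->nodelists[n] != NULL */
};

/* Take one object from exactly this node, or fail. */
static int model_take(struct model_cache *c, int node)
{
	assert(node >= 0 && node < MODEL_MAX_NODES);
	if (!c->bootstrapped[node] || c->free_objs[node] <= 0)
		return MODEL_NO_OBJECT;
	c->free_objs[node]--;
	return node;
}

/* Models fallback_alloc(): try every node in order. */
static int model_fallback_alloc(struct model_cache *c)
{
	for (int node = 0; node < MODEL_MAX_NODES; node++) {
		int got = model_take(c, node);
		if (got != MODEL_NO_OBJECT)
			return got;
	}
	return MODEL_NO_OBJECT;
}

/* Models ____cache_alloc(): local node only, never falls back. */
static int model_local_alloc(struct model_cache *c, int cur)
{
	return model_take(c, cur);
}

/* Models ____cache_alloc_node(): assumed to fall back unless THISNODE is set. */
static int model_node_alloc(struct model_cache *c, unsigned flags, int node)
{
	int got = model_take(c, node);
	if (got == MODEL_NO_OBJECT && !(flags & MODEL_THISNODE))
		got = model_fallback_alloc(c);
	return got;
}

/* Dispatch before the patch: a local request can fail with no fallback. */
static int model_alloc_node_old(struct model_cache *c, unsigned flags,
				int nodeid, int cur)
{
	if (nodeid == -1 || nodeid == cur || !c->bootstrapped[nodeid])
		return model_local_alloc(c, cur);
	return model_node_alloc(c, flags, nodeid);
}

/* Dispatch after the patch, mirroring the new __cache_alloc_node() body. */
static int model_alloc_node(struct model_cache *c, unsigned flags,
			    int nodeid, int cur)
{
	int got = MODEL_NO_OBJECT;

	if (nodeid == -1)
		nodeid = cur;

	if (c->bootstrapped[nodeid]) {
		if (nodeid == cur)
			got = model_local_alloc(c, cur);	/* fast local path */
		if (got == MODEL_NO_OBJECT)
			got = model_node_alloc(c, flags, nodeid);
	} else if (!(flags & MODEL_THISNODE)) {
		got = model_fallback_alloc(c);	/* node not bootstrapped yet */
	}
	return got;
}
```

In this model, with the current node 0 empty and node 1 holding objects, `model_alloc_node_old()` fails for a node 0 request while `model_alloc_node()` is served from node 1. With `MODEL_THISNODE` set, both fail, which matches the restricted-allocation behaviour described in the commit message.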