x86: reinstate numa remap for SPARSEMEM on x86 NUMA systems

Recent kernels have been panicking while trying to allocate memory early
in boot, in __alloc_pages:

BUG: unable to handle kernel paging request at 00001568
IP: [<c10407b6>] __alloc_pages+0x33/0x2cc
*pdpt = 00000000013a5001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:
Pid: 1, comm: swapper Not tainted (2.6.25 #78)
EIP: 0060:[<c10407b6>] EFLAGS: 00010246 CPU: 0
EIP is at __alloc_pages+0x33/0x2cc
EAX: 00001564 EBX: 000412d0 ECX: 00001564 EDX: 000005c3
ESI: f78012a0 EDI: 00000001 EBP: 00001564 ESP: f7871e50
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 1, ti=f7870000 task=f786f670 task.ti=f7870000)
Stack:
00000000 f786f670 00000010 00000000 0000b700 000412d0 f78012a0 00000001
00000000 c105b64d 00000000 000412d0 f78012a0 f7803120 00000000 c105c1c5
00000010 f7803144 000412d0 00000001 f7803130 f7803120 f78012a0 00000001
Call Trace:
[<c105b64d>] kmem_getpages+0x94/0x129
[<c105c1c5>] cache_grow+0x8f/0x123
[<c105c689>] ____cache_alloc_node+0xb9/0xe4
[<c105c999>] kmem_cache_alloc_node+0x92/0xd2
[<c1018929>] build_sched_domains+0x536/0x70d
[<c100b63c>] do_flush_tlb_all+0x0/0x3f
[<c100b63c>] do_flush_tlb_all+0x0/0x3f
[<c10572d6>] interleave_nodes+0x23/0x5a
[<c105c44f>] alternate_node_alloc+0x43/0x5b
[<c1018b47>] arch_init_sched_domains+0x46/0x51
[<c136e85e>] kernel_init+0x0/0x82
[<c137ac19>] sched_init_smp+0x10/0xbb
[<c136e8a1>] kernel_init+0x43/0x82
[<c10035cf>] kernel_thread_helper+0x7/0x10

Debugging this showed that the NODE_DATA() pointers for all nodes other
than node 0 were NULL. Tracing this back showed that the NODE_DATA()
pointers were being initialised to each node's remap space. However, under
SPARSEMEM remap is disabled, which leads to the pgdats being placed
incorrectly at kernel virtual address 0. This in turn causes the panic
when attempting to allocate memory from these nodes.
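
The failure mode described above can be sketched in user-space C. This is
a minimal illustration under stated assumptions, not the kernel code: the
names remap_start_vaddr, node_data, allocate_pgdat_sketch() and
field_address(), the node count, and the example remap base are all
hypothetical. It shows only that a pgdat pointer taken from a zeroed remap
base turns any field access into a low address, which is consistent with
the fault at 00001568 in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 4	/* hypothetical node count for this sketch */

/* Hypothetical per-node remap base; 0 means "no remap space set up". */
static uintptr_t remap_start_vaddr[SKETCH_MAX_NODES];

/*
 * Hypothetical per-node pgdat pointers, held as integers so the sketch
 * never dereferences a bad address.
 */
static uintptr_t node_data[SKETCH_MAX_NODES];

/*
 * Point the node's pgdat at its remap space. With remap disabled the
 * base is left at 0, so the pgdat pointer silently becomes NULL.
 */
static void allocate_pgdat_sketch(int nid, int remap_enabled,
				  uintptr_t remap_base)
{
	assert(nid >= 0 && nid < SKETCH_MAX_NODES);

	remap_start_vaddr[nid] = remap_enabled ? remap_base : 0;
	node_data[nid] = remap_start_vaddr[nid];
}

/* Address that an access to a pgdat field at 'offset' would touch. */
static uintptr_t field_address(int nid, size_t offset)
{
	assert(nid >= 0 && nid < SKETCH_MAX_NODES);

	return node_data[nid] + offset;
}
```

Calling allocate_pgdat_sketch(1, 0, base) and then field_address(1, 0x1568)
yields 0x1568 regardless of base, whereas with remap enabled the same
access lands inside the node's remap space.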

NUMA remap was disabled in the commit below. This occurred while fixing
problems triggered when attempting to boot x86_32 NUMA SPARSEMEM kernels
on non-NUMA hardware.

x86: make NUMA work on 32-bit
commit 1b000a5dbeb2f34bc03d45ebdf3f6d24a60c3aed

The real problem is believed to be related to other alignment issues in
the regions blocked out from the bootmem allocator for small memory
systems, and has been fixed separately. Therefore re-enable remap for
SPARSEMEM, which fixes the pgdat allocation issues. Testing confirms that
SPARSEMEM NUMA kernels will boot correctly with this part of the change
reverted.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Ingo Molnar <mingo@elte.hu>