--- /dev/null
+ .. SPDX-License-Identifier: GPL-2.0
+
+ =============
+ Kernel Stacks
+ =============
+
+ Kernel stacks on x86-64
+ =======================
+
+ Most of the text from Keith Owens, hacked by AK
+
+ x86_64 page size (PAGE_SIZE) is 4K.
+
+ Like all other architectures, x86_64 has a kernel stack for every
+ active thread. These thread stacks are THREAD_SIZE (4*PAGE_SIZE) big.
+ These stacks contain useful data as long as a thread is alive or a
+ zombie. While the thread is in user space the kernel stack is empty.
+
+ In addition to the per thread stacks, there are specialized stacks
+ associated with each CPU. These stacks are only used while the kernel
+ is in control on that CPU; when a CPU returns to user space the
+ specialized stacks contain no useful data. The main CPU stacks are:
+
+ * Interrupt stack. IRQ_STACK_SIZE
+
+ Used for external hardware interrupts. If this is the first external
+ hardware interrupt (i.e. not a nested hardware interrupt) then the
+ kernel switches from the current task's stack to the interrupt stack. Like
+ the split thread and interrupt stacks on i386, this gives more room
+ for kernel interrupt processing without having to increase the size
+ of every per thread stack.
+
+ The interrupt stack is also used when processing a softirq.
+
+ Switching to the kernel interrupt stack is done by software based on a
+ per CPU interrupt nest counter. This is needed because x86-64 "IST"
+ hardware stacks cannot nest without races.
+
+ x86_64 also has a feature which is not available on i386, the ability
+ to automatically switch to a new stack for designated events such as
+ double fault or NMI, which makes it easier to handle these unusual
+ events on x86_64. This feature is called the Interrupt Stack Table
+ (IST). There can be up to 7 IST entries per CPU. The IST code is an
+ index into the Task State Segment (TSS). The IST entries in the TSS
+ point to dedicated stacks; each stack can be a different size.
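+
+ A hedged sketch of the hardware structure this describes, following the
+ 64-bit TSS layout in the architecture manuals (the kernel's own
+ definition is struct x86_hw_tss in arch/x86/include/asm/processor.h)::
+
+   struct hw_tss64 {
+           unsigned int       reserved0;
+           unsigned long long rsp[3];   /* stacks for privilege levels 0-2 */
+           unsigned long long reserved1;
+           unsigned long long ist[7];   /* Interrupt Stack Table: entry N-1
+                                         * is loaded into RSP when a gate
+                                         * selects IST index N (1-7) */
+           unsigned long long reserved2;
+           unsigned short     reserved3;
+           unsigned short     io_bitmap_base;
+   } __attribute__((packed));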
+
+ An IST is selected by a non-zero value in the IST field of an
+ interrupt-gate descriptor. When an interrupt occurs and the hardware
+ loads such a descriptor, the hardware automatically sets the new stack
+ pointer based on the IST value, then invokes the interrupt handler. If
+ the interrupt came from user mode, then the interrupt handler prologue
+ will switch back to the per-thread stack. If software wants to allow
+ nested IST interrupts then the handler must adjust the IST values on
+ entry to and exit from the interrupt handler. (This is occasionally
+ done, e.g. for debug exceptions.)
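+
+ For illustration, a sketch of the 16-byte 64-bit interrupt-gate
+ descriptor carrying that IST field (cf. struct gate_struct in
+ arch/x86/include/asm/desc_defs.h)::
+
+   struct idt_gate64 {
+           unsigned short offset_low;    /* handler address bits 0-15 */
+           unsigned short segment;       /* code segment selector */
+           unsigned short ist   : 3,     /* 0 = no stack switch,
+                                          * 1-7 = TSS IST entry to load */
+                          zero  : 5,
+                          type  : 5,     /* gate type */
+                          dpl   : 2,     /* descriptor privilege level */
+                          p     : 1;     /* present bit */
+           unsigned short offset_middle; /* handler address bits 16-31 */
+           unsigned int   offset_high;   /* handler address bits 32-63 */
+           unsigned int   reserved;
+   } __attribute__((packed));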
+
+ Events with different IST codes (i.e. with different stacks) can be
+ nested. For example, a debug interrupt can safely be interrupted by an
+ NMI. arch/x86/entry/entry_64.S::paranoid_entry adjusts the stack
+ pointers on entry to and exit from all IST events, in theory allowing
+ IST events with the same code to be nested. However in most cases, the
+ stack size allocated to an IST assumes no nesting for the same code.
+ If that assumption is ever broken then the stacks will become corrupt.
+
+ The currently assigned IST stacks are:
+
+* ESTACK_DF. EXCEPTION_STKSZ (PAGE_SIZE).
+
+ Used for interrupt 8 - Double Fault Exception (#DF).
+
+ Invoked when handling one exception causes another exception. Happens
+ when the kernel is very confused (e.g. kernel stack pointer corrupt).
+ Using a separate stack allows the kernel to recover from it well enough
+ in many cases to still output an oops.
+
+* ESTACK_NMI. EXCEPTION_STKSZ (PAGE_SIZE).
+
+ Used for non-maskable interrupts (NMI).
+
+ NMI can be delivered at any time, including when the kernel is in the
+ middle of switching stacks. Using IST for NMI events avoids making
+ assumptions about the previous state of the kernel stack.
+
+* ESTACK_DB. EXCEPTION_STKSZ (PAGE_SIZE).
+
+ Used for hardware debug interrupts (interrupt 1) and for software
+ debug interrupts (INT3).
+
+ When debugging a kernel, debug interrupts (both hardware and
+ software) can occur at any time. Using IST for these interrupts
+ avoids making assumptions about the previous state of the kernel
+ stack.
+
+ To handle nested #DB correctly there exist two instances of DB stacks. On
+ #DB entry the IST stack pointer for #DB is switched to the second instance
+ so a nested #DB starts from a clean stack. The nested #DB switches
+ the IST stack pointer to a guard hole to catch triple nesting.
+
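+ A hedged sketch of that shift, reusing the hw_tss64 sketch from above
+ (constants are illustrative; the real adjustment is done with an IST
+ offset in the kernel's #DB entry code)::
+
+   #define IST_DB          2        /* cf. the kernel's IST_INDEX_DB */
+   #define DB_STACK_STRIDE 0x2000UL /* one stack instance plus guard gap,
+                                     * illustrative value */
+
+   /* On #DB entry, point the IST slot at the next stack instance so a
+    * nested #DB gets a clean stack; undo it on exit.  A third level
+    * lands in the guard hole and faults loudly. */
+   void db_enter(struct hw_tss64 *tss)
+   {
+           tss->ist[IST_DB] -= DB_STACK_STRIDE;
+   }
+
+   void db_exit(struct hw_tss64 *tss)
+   {
+           tss->ist[IST_DB] += DB_STACK_STRIDE;
+   }
+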
+* ESTACK_MCE. EXCEPTION_STKSZ (PAGE_SIZE).
+
+ Used for interrupt 18 - Machine Check Exception (#MC).
+
+ MCE can be delivered at any time, including when the kernel is in the
+ middle of switching stacks. Using IST for MCE events avoids making
+ assumptions about the previous state of the kernel stack.
+
+ For more details see the Intel 64 and IA-32 or AMD64 architecture manuals.
+
+
+ Printing backtraces on x86
+ ==========================
+
+ The question about the '?' preceding function names in an x86 stacktrace
+ keeps popping up, so here's an in-depth explanation. It helps if the reader
+ stares at print_context_stack() and the whole machinery in and around
+ arch/x86/kernel/dumpstack.c.
+
+ Adapted from Ingo's mail, Message-ID: <20150521101614.GA10889@gmail.com>:
+
+ We always scan the full kernel stack for return addresses stored on
+ the kernel stack(s) [1]_, from stack top to stack bottom, and print out
+ anything that 'looks like' a kernel text address.
+
+ If it fits into the frame pointer chain, we print it without a question
+ mark, knowing that it's part of the real backtrace.
+
+ If the address does not fit into our expected frame pointer chain we
+ still print it, but we print a '?'. It can mean two things:
+
+ - either the address is not part of the call chain: it's just stale
+ values on the kernel stack, from earlier function calls. This is
+ the common case.
+
+ - or it is part of the call chain, but the frame pointer was not set
+ up properly within the function, so we don't recognize it.
+
+ This way we will always print out the real call chain (plus a few more
+ entries), regardless of whether the frame pointer was set up correctly
+ or not - but in most cases we'll get the call chain right as well. The
+ entries printed are strictly in stack order, so you can deduce more
+ information from that as well.
+
+ The most important property of this method is that we _never_ lose
+ information: we always strive to print _all_ addresses on the stack(s)
+ that look like kernel text addresses, so if debug information is wrong,
+ we still print out the real call chain as well - just with more question
+ marks than ideal.
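+
+ A condensed sketch of that scan (the helper names are illustrative
+ stand-ins; the real logic uses __kernel_text_address() and the frame
+ pointer walk in arch/x86/kernel/dumpstack.c)::
+
+   #include <linux/kernel.h>
+
+   extern bool looks_like_kernel_text(unsigned long addr);
+   extern bool on_frame_pointer_chain(unsigned long *sp, unsigned long addr);
+
+   void scan_stack(unsigned long *top, unsigned long *bottom)
+   {
+           unsigned long *sp;
+
+           for (sp = top; sp < bottom; sp++) {
+                   unsigned long addr = *sp;
+
+                   if (!looks_like_kernel_text(addr))
+                           continue;                 /* just data, skip */
+
+                   if (on_frame_pointer_chain(sp, addr))
+                           printk(" %pS\n", (void *)addr);   /* reliable */
+                   else
+                           printk(" ? %pS\n", (void *)addr); /* possibly stale */
+           }
+   }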
+
+ .. [1] For things like IRQ and IST stacks, we also scan those stacks, in
+ the right order, and try to cross from one stack into another
+ reconstructing the call chain. This works most of the time.
--- /dev/null
+ .. SPDX-License-Identifier: GPL-2.0
+
+ ============
+ x86 Topology
+ ============
+
+ This documents and clarifies the main aspects of x86 topology modelling and
+ representation in the kernel. Update/change it when making changes to the
+ respective code.
+
+ The architecture-agnostic topology definitions are in
+ Documentation/cputopology.txt. This file holds x86-specific
+ differences/specialities which do not necessarily apply to the generic
+ definitions. Thus, the way to read up on Linux topology on x86 is to start
+ with the generic one and look at this one in parallel for the x86 specifics.
+
+ Needless to say, code should use the generic functions - this file is *only*
+ here to *document* the inner workings of x86 topology.
+
+ Started by Thomas Gleixner <tglx@linutronix.de> and Borislav Petkov <bp@alien8.de>.
+
+ The main aim of the topology facilities is to present adequate interfaces to
+ code which needs to know/query/use the structure of the running system wrt
+ threads, cores, packages, etc.
+
+ The kernel does not care about the concept of physical sockets because a
+ socket has no relevance to software. It's an electromechanical component. In
+ the past a socket always contained a single package (see below), but with the
+ advent of Multi Chip Modules (MCM) a socket can hold more than one package. So
+ there might still be references to sockets in the code, but they are of
+ historical nature and should be cleaned up.
+
+ The topology of a system is described in the units of:
+
+ - packages
+ - cores
+ - threads
+
+ Package
+ =======
+ Packages contain a number of cores plus shared resources, e.g. DRAM
+ controller, shared caches etc.
+
+ AMD nomenclature for package is 'Node'.
+
+ Package-related topology information in the kernel:
+
+ - cpuinfo_x86.x86_max_cores:
+
+ The number of cores in a package. This information is retrieved via CPUID.
+
+ - cpuinfo_x86.phys_proc_id:
+
+ The physical ID of the package. This information is retrieved via CPUID
+ and deduced from the APIC IDs of the cores in the package.
+
+ - cpuinfo_x86.logical_proc_id:
+
+ The logical ID of the package. As we do not trust BIOSes to enumerate the
+ packages in a consistent way, we introduced the concept of logical package
+ ID so we can sanely calculate the maximum possible number of packages in
+ the system and have the packages enumerated linearly.
+
+ - topology_max_packages():
+
+ The maximum possible number of packages in the system. Helpful for per
+ package facilities to preallocate per package information.
+
+ - cpu_llc_id:
+
+ A per-CPU variable containing:
+
+ - On Intel, the first APIC ID of the list of CPUs sharing the Last Level
+ Cache
+
+ - On AMD, the Node ID or Core Complex ID containing the Last Level
+ Cache. In general, it is a number identifying an LLC uniquely on the
+ system.
+
+ Cores
+ =====
+ A core consists of 1 or more threads. It does not matter whether the threads
+ are SMT- or CMT-type threads.
+
+ AMD's nomenclature for a CMT core is "Compute Unit". The kernel always uses
+ "core".
+
+ Core-related topology information in the kernel:
+
+ - smp_num_siblings:
+
+ The number of threads in a core. The number of threads in a package can be
+ calculated by::
+
+ threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
+
+
+ Threads
+ =======
+ A thread is a single scheduling unit. It's the equivalent of a logical Linux
+ CPU.
+
+ AMD's nomenclature for CMT threads is "Compute Unit Core". The kernel always
+ uses "thread".
+
+ Thread-related topology information in the kernel:
+
+ - topology_core_cpumask():
+
+ The cpumask contains all online threads in the package to which a thread
+ belongs.
+
+ The number of online threads is also printed in /proc/cpuinfo "siblings."
+
+ - topology_sibling_cpumask():
+
+ The cpumask contains all online threads in the core to which a thread
+ belongs.
+
+ - topology_logical_package_id():
+
+ The logical package ID to which a thread belongs.
+
+ - topology_physical_package_id():
+
+ The physical package ID to which a thread belongs.
+
+ - topology_core_id():
+
+ The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
+ "core_id."
+
+
+
+ System topology examples
+ ========================
+
+ .. note::
+ The alternative Linux CPU enumeration depends on how the BIOS enumerates the
+ threads. Many BIOSes enumerate all threads 0 first and then all threads 1.
+ That has the "advantage" that the logical Linux CPU numbers of threads 0 stay
+ the same whether threads are enabled or not. That's merely an implementation
+ detail and has no practical impact.
+
+ 1) Single Package, Single Core::
+
+ [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+
+ 2) Single Package, Dual Core
+
+ a) One thread per core::
+
+ [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+ -> [core 1] -> [thread 0] -> Linux CPU 1
+
+ b) Two threads per core::
+
+ [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+ -> [thread 1] -> Linux CPU 1
+ -> [core 1] -> [thread 0] -> Linux CPU 2
+ -> [thread 1] -> Linux CPU 3
+
+ Alternative enumeration::
+
+ [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+ -> [thread 1] -> Linux CPU 2
+ -> [core 1] -> [thread 0] -> Linux CPU 1
+ -> [thread 1] -> Linux CPU 3
+
+ AMD nomenclature for CMT systems::
+
+ [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+ -> [Compute Unit Core 1] -> Linux CPU 1
+ -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+ -> [Compute Unit Core 1] -> Linux CPU 3
+
+ 3) Dual Package, Dual Core
+
+ a) One thread per core::
+
+ [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+ -> [core 1] -> [thread 0] -> Linux CPU 1
+
+ [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+ -> [core 1] -> [thread 0] -> Linux CPU 3
+
+ b) Two threads per core::
+
+ [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+ -> [thread 1] -> Linux CPU 1
+ -> [core 1] -> [thread 0] -> Linux CPU 2
+ -> [thread 1] -> Linux CPU 3
+
+ [package 1] -> [core 0] -> [thread 0] -> Linux CPU 4
+ -> [thread 1] -> Linux CPU 5
+ -> [core 1] -> [thread 0] -> Linux CPU 6
+ -> [thread 1] -> Linux CPU 7
+
+ Alternative enumeration::
+
+ [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+ -> [thread 1] -> Linux CPU 4
+ -> [core 1] -> [thread 0] -> Linux CPU 1
+ -> [thread 1] -> Linux CPU 5
+
+ [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+ -> [thread 1] -> Linux CPU 6
+ -> [core 1] -> [thread 0] -> Linux CPU 3
+ -> [thread 1] -> Linux CPU 7
+
+ AMD nomenclature for CMT systems::
+
+ [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+ -> [Compute Unit Core 1] -> Linux CPU 1
+ -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+ -> [Compute Unit Core 1] -> Linux CPU 3
+
+ [node 1] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 4
+ -> [Compute Unit Core 1] -> Linux CPU 5
+ -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 6
+ -> [Compute Unit Core 1] -> Linux CPU 7
--- /dev/null
- from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PT starting
+ .. SPDX-License-Identifier: GPL-2.0
+
+ =================
+ Memory Management
+ =================
+
+ Complete virtual memory map with 4-level page tables
+ ====================================================
+
+ .. note::
+
+ - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down
+ from the top of the 64-bit address space. It's easier to understand the layout
+ when seen both in absolute addresses and in distance-from-top notation.
+
+ For example 0xffffe90000000000 == -23 TB: it's 23 TB lower than the top of the
+ 64-bit address space (ffffffffffffffff).
+
+ Note that as we get closer to the top of the address space, the notation changes
+ from TB to GB and then MB/KB.
+
+ - "16M TB" might look weird at first sight, but it's an easier to visualize size
+ notation than "16 EB", which few will recognize at first sight as 16 exabytes.
+ It also shows it nicely how incredibly large 64-bit address space is.
+
+ ::
+
+ ========================================================================================================================
+ Start addr | Offset | End addr | Size | VM area description
+ ========================================================================================================================
+ | | | |
+ 0000000000000000 | 0 | 00007fffffffffff | 128 TB | user-space virtual memory, different per mm
+ __________________|____________|__________________|_________|___________________________________________________________
+ | | | |
+ 0000800000000000 | +128 TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
+ | | | | virtual memory addresses up to the -128 TB
+ | | | | starting offset of kernel mappings.
+ __________________|____________|__________________|_________|___________________________________________________________
+ |
+ | Kernel-space virtual memory, shared between all processes:
+ ____________________________________________________________|___________________________________________________________
+ | | | |
+ ffff800000000000 | -128 TB | ffff87ffffffffff | 8 TB | ... guard hole, also reserved for hypervisor
+ ffff880000000000 | -120 TB | ffff887fffffffff | 0.5 TB | LDT remap for PTI
+ ffff888000000000 | -119.5 TB | ffffc87fffffffff | 64 TB | direct mapping of all physical memory (page_offset_base)
+ ffffc88000000000 | -55.5 TB | ffffc8ffffffffff | 0.5 TB | ... unused hole
+ ffffc90000000000 | -55 TB | ffffe8ffffffffff | 32 TB | vmalloc/ioremap space (vmalloc_base)
+ ffffe90000000000 | -23 TB | ffffe9ffffffffff | 1 TB | ... unused hole
+ ffffea0000000000 | -22 TB | ffffeaffffffffff | 1 TB | virtual memory map (vmemmap_base)
+ ffffeb0000000000 | -21 TB | ffffebffffffffff | 1 TB | ... unused hole
+ ffffec0000000000 | -20 TB | fffffbffffffffff | 16 TB | KASAN shadow memory
+ __________________|____________|__________________|_________|____________________________________________________________
+ |
+ | Identical layout to the 56-bit one from here on:
+ ____________________________________________________________|____________________________________________________________
+ | | | |
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
+ ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
+ ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
+ ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
+ ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
+ ffffffff80000000 |-2048 MB | | |
+ ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
+ ffffffffff000000 | -16 MB | | |
+ FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
+ ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
+ ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
+ __________________|____________|__________________|_________|___________________________________________________________
+
+
+ Complete virtual memory map with 5-level page tables
+ ====================================================
+
+ .. note::
+
+ - With 56-bit addresses, user-space memory gets expanded by a factor of 512x,
+ from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PB starting
+ offset and many of the regions expand to support the much larger physical
+ memory supported.
+
+ ::
+
+ ========================================================================================================================
+ Start addr | Offset | End addr | Size | VM area description
+ ========================================================================================================================
+ | | | |
+ 0000000000000000 | 0 | 00ffffffffffffff | 64 PB | user-space virtual memory, different per mm
+ __________________|____________|__________________|_________|___________________________________________________________
+ | | | |
+ 0100000000000000 | +64 PB | feffffffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical
+ | | | | virtual memory addresses up to the -64 PB
+ | | | | starting offset of kernel mappings.
+ __________________|____________|__________________|_________|___________________________________________________________
+ |
+ | Kernel-space virtual memory, shared between all processes:
+ ____________________________________________________________|___________________________________________________________
+ | | | |
+ ff00000000000000 | -64 PB | ff0fffffffffffff | 4 PB | ... guard hole, also reserved for hypervisor
+ ff10000000000000 | -60 PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI
+ ff11000000000000 | -59.75 PB | ff90ffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base)
+ ff91000000000000 | -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole
+ ffa0000000000000 | -24 PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
+ ffd2000000000000 | -11.5 PB | ffd3ffffffffffff | 0.5 PB | ... unused hole
+ ffd4000000000000 | -11 PB | ffd5ffffffffffff | 0.5 PB | virtual memory map (vmemmap_base)
+ ffd6000000000000 | -10.5 PB | ffdeffffffffffff | 2.25 PB | ... unused hole
+ ffdf000000000000 | -8.25 PB | fffffbffffffffff | ~8 PB | KASAN shadow memory
+ __________________|____________|__________________|_________|____________________________________________________________
+ |
+ | Identical layout to the 47-bit one from here on:
+ ____________________________________________________________|____________________________________________________________
+ | | | |
+ fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole
+ | | | | vaddr_end for KASLR
+ fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
+ fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+ ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
+ ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
+ ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space
+ ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole
+ ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0
+ ffffffff80000000 |-2048 MB | | |
+ ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space
+ ffffffffff000000 | -16 MB | | |
+ FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
+ ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI
+ ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole
+ __________________|____________|__________________|_________|___________________________________________________________
+
+ The architecture defines a 64-bit virtual address. Implementations can support
+ less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
+ through to the most-significant implemented bit are sign extended.
+ This causes a hole between user space and kernel addresses if you interpret them
+ as unsigned.
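+
+ For illustration, a user-space sketch of that canonicality rule, where
+ va_bits is the implemented width (48 or 57)::
+
+   #include <stdbool.h>
+   #include <stdint.h>
+
+   static bool is_canonical(uint64_t vaddr, unsigned int va_bits)
+   {
+           unsigned int shift = 64 - va_bits;
+
+           /* Bits 63 .. va_bits-1 must all equal bit va_bits-1, i.e.
+            * sign-extending the low va_bits must reproduce the address.
+            * (Arithmetic right shift of a signed value is
+            * implementation-defined in ISO C but universal in practice.) */
+           return (uint64_t)((int64_t)(vaddr << shift) >> shift) == vaddr;
+   }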
+
+ The direct mapping covers all memory in the system up to the highest
+ memory address (this means in some cases it can also include PCI memory
+ holes).
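+
+ That fixed offset makes physical/virtual conversion inside the direct map
+ pure arithmetic, which is roughly what the kernel's __va()/__pa() helpers
+ do for such addresses (sketch; page_offset_base is the direct-map base
+ mentioned in the tables above)::
+
+   extern unsigned long page_offset_base;
+
+   static void *phys_to_direct_virt(unsigned long phys)
+   {
+           return (void *)(page_offset_base + phys);        /* cf. __va() */
+   }
+
+   static unsigned long direct_virt_to_phys(void *virt)
+   {
+           /* Only valid for addresses inside the direct mapping. */
+           return (unsigned long)virt - page_offset_base;   /* cf. __pa() */
+   }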
+
+ vmalloc space is lazily synchronized into the different PML4/PML5 pages of
+ the processes using the page fault handler, with init_top_pgt as
+ reference.
+
+ We map EFI runtime services in the 'efi_pgd' PGD in a 64 GB large virtual
+ memory window (this size is arbitrary, it can be raised later if needed).
+ The mappings are not part of any other kernel PGD and are only available
+ during EFI runtime calls.
+
+ Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
+ physical memory, vmalloc/ioremap space and virtual memory map are randomized.
+ Their order is preserved but their base will be offset early at boot time.
+
+ Be very careful vs. KASLR when changing anything here. The KASLR address
+ range must not overlap with anything except the KASAN shadow area, which is
+ correct as KASAN disables KASLR.
+
+ For both 4- and 5-level layouts, the STACKLEAK_POISON value lies in the last
+ 2 MB hole: ffffffffffff4111.