In long mode the %cs is largely a relic. However there are a few cases
like iret where it matters that we have a valid value. Without this
patch it is possible to enter the kernel in startup_64 without setting
%cs to a valid value. With this patch we don't care what %cs value
we enter the kernel with, so long as the cs shadow register indicates
it is a privileged code segment.
Thanks to Magnus Damm for finding this problem and posting the
first workable patch. I have moved the jump to set %cs down a
few instructions so we don't need to take an extra jump. Which
keeps the code simpler.
Signed-of-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
/* Finally jump to run C code and to be on real kernel address
* Since we are running on identity-mapped space we have to jump
- * to the full 64bit address , this is only possible as indirect
- * jump
+ * to the full 64bit address, this is only possible as indirect
+ * jump. In addition we need to ensure %cs is set so we make this
+ * a far return.
*/
movq initial_code(%rip),%rax
- pushq $0 # fake return address
- jmp *%rax
+ pushq $0 # fake return address to stop unwinder
+ pushq $__KERNEL_CS # set correct cs
+ pushq %rax # target address in negative space
+ lretq
/* SMP bootup changes these two */
.align 8