Start to Finish: Switching to Long Mode
From XOmB wiki
This page is part of a series on XBB From Start to Finish.
This section of the series XBB From Start to Finish explains the second step of a full boot in our GRUB-based x86-64 XOmB Bare Bones distribution.
Receiving the GRUB Baton
This is a small overview of what was described in the previous part of the series. GRUB starts, as does the x86 processor itself, in x86 real mode. As a boot loader, GRUB sets up an environment in which the kernel can execute. In the bare bones, GRUB makes the jump from 16-bit real mode into 32-bit protected mode.
GRUB will load the kernel executable into memory at 0x100000 (or 1MB, in user-speak). It will look for the multiboot header within the module and use this header to learn about the entry point to call once the module is loaded. It does this by searching the module for the multiboot magic number, defined as MULTIBOOT_HEADER_MAGIC.
Once it finds the entry point, it will jump to that address with a magic value in register EAX, which tells the kernel that a multiboot-compliant loader started it, and the address of the multiboot information structure in EBX. It will be the kernel's responsibility to handle the information given to it from there onward.
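The handoff described above can be sketched in C. This is only an illustration, not code from XOmB: the two magic constants, the 8 KiB search window, and the checksum rule come from the Multiboot (version 1) specification, while the function names find_multiboot_header and booted_by_multiboot are hypothetical. The sketch also assumes a little-endian host, as x86 is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values from the Multiboot (version 1) specification. */
#define MULTIBOOT_HEADER_MAGIC     0x1BADB002u /* stored inside the kernel image */
#define MULTIBOOT_BOOTLOADER_MAGIC 0x2BADB002u /* left in EAX by the loader */
#define MULTIBOOT_SEARCH_LIMIT     8192u       /* header lies in the first 8 KiB */

/* The three mandatory header fields; optional fields may follow them. */
struct multiboot_header {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum; /* chosen so magic + flags + checksum == 0 (mod 2^32) */
};

/* Hypothetical helper: returns the byte offset of the first valid header,
 * or -1 if none is found. Headers are 4-byte aligned. */
long find_multiboot_header(const uint8_t *image, size_t size)
{
    assert(image != NULL);
    size_t limit = size < MULTIBOOT_SEARCH_LIMIT ? size : MULTIBOOT_SEARCH_LIMIT;
    for (size_t off = 0; off + sizeof(struct multiboot_header) <= limit; off += 4) {
        struct multiboot_header h;
        memcpy(&h, image + off, sizeof h); /* copy out instead of casting */
        if (h.magic == MULTIBOOT_HEADER_MAGIC &&
            (uint32_t)(h.magic + h.flags + h.checksum) == 0)
            return (long)off;
    }
    return -1;
}

/* Hypothetical kernel-side check of the value found in EAX at entry. */
int booted_by_multiboot(uint32_t eax)
{
    return eax == MULTIBOOT_BOOTLOADER_MAGIC;
}
```

Note that the value the loader searches for in the image and the value the kernel sees in EAX are two different constants.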
Entry Point: _start
The first thing that this initial code does is save the multiboot information passed by GRUB. It does this by moving the values into registers we swear not to touch until they are needed. These are RDI and RSI, which the ABI uses to pass the first two parameters in 64 bit mode, so the values eventually arrive as the parameters to kmain.
Then the kernel will disable interrupts, which should be done when given control by GRUB. We have yet to set up the interrupt mechanisms, and we do not want some chicken-and-egg problem interfering.
From here we go into the exceptionally odd process of switching to long mode.
Supporting Long Mode
The code then turns on the Physical Address Extension, or PAE. In short, it allows for more than 32 bits of addressable space, which is obviously a concern of 64-bit operating systems. You turn this on by setting bit 5 of the control register CR4.
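As a small sketch of the bit manipulation involved (the real code does this in assembly, since CR4 can only be read and written with privileged mov instructions), the helper below computes the new CR4 value. The names CR4_PAE and cr4_with_pae are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_PAE_BIT 5u
#define CR4_PAE     (UINT32_C(1) << CR4_PAE_BIT) /* 0x20 */

/* Hypothetical helper: returns the CR4 value with PAE switched on.
 * Every other bit is preserved, which is why the boot code reads CR4,
 * ORs in the bit, and writes the result back. */
uint32_t cr4_with_pae(uint32_t cr4)
{
    uint32_t updated = cr4 | CR4_PAE;
    assert((updated & ~CR4_PAE) == (cr4 & ~CR4_PAE)); /* other bits untouched */
    return updated;
}

/* Reports whether PAE is already enabled in a CR4 value. */
int cr4_has_pae(uint32_t cr4)
{
    return (cr4 & CR4_PAE) != 0;
}
```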
It proceeds to set the page table that will be used once long mode is enabled. This page table is a 4-level behemoth. It is only meant to be a temporary page table, however. Notice that the data structures are defined within boot.s. The particular chunk referenced defines the fourth level (which is the first level to be traversed by hardware during a page address translation). Each succeeding level is defined below it.
Notice that for each individual level, there are 512 entries. For levels 4, 3 and 2, these entries point to tables of the next lower level: 4 points to 3, 3 points to 2, and so on. For our tables, we have defined only particular entries.
For the fourth level, we have defined entry 0 and entry 256. This is because we want to eventually execute kernel code in the higher memory regions and leave lower memory reserved as a consistent location for userspace applications and libraries. So, we point entry 0 and entry 256 at the same level 3 table. That shared level 3 table simply maps one level 2 table at entry 0. The level 2 table points to enough level 1 tables to map several dozen megabytes of physical memory, where each level 1 table fully maps 2MB of memory (512 entries of 4KB each). From there, it is the same, and the flexibility of 4-level page tables is suddenly apparent. (Also notice, as an aside, that the tables must be aligned to a 4K memory boundary.)
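To see why entry 0 and entry 256 are the interesting ones, it helps to split a virtual address into its four table indices. The sketch below assumes standard 4-level paging with 4KB pages. KERNEL_VMA_BASE_EXAMPLE is a hypothetical value chosen because it is the lowest canonical address whose level 4 index is 256; the actual KERNEL_VMA_BASE used by the bare bones may differ.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES_PER_TABLE 512u
#define PAGE_SIZE         4096u

/* Hypothetical higher-half base: lowest address with level 4 index 256. */
#define KERNEL_VMA_BASE_EXAMPLE UINT64_C(0xFFFF800000000000)

/* Index into the table at the given level (4 = top, 1 = bottom).
 * Bits 0-11 are the page offset; each level above consumes 9 bits. */
unsigned table_index(uint64_t vaddr, unsigned level)
{
    assert(level >= 1 && level <= 4);
    return (unsigned)((vaddr >> (12 + 9 * (level - 1))) & (ENTRIES_PER_TABLE - 1));
}

/* Bytes of address space covered by ONE entry at the given level.
 * A level 2 entry names a whole level 1 table, hence 2MB. */
uint64_t bytes_per_entry(unsigned level)
{
    assert(level >= 1 && level <= 4);
    return (uint64_t)PAGE_SIZE << (9 * (level - 1));
}
```

With these definitions, 0x100000 and KERNEL_VMA_BASE_EXAMPLE + 0x100000 differ only in their level 4 index (0 versus 256). Because both level 4 entries point to the same level 3 table, both addresses resolve to the same physical memory.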
Back in the main execution path, we advance to the next set of code. Here, the instructions SYSCALL and SYSRET, which are efficient methods of invoking system calls on the x86 hardware, are enabled. SYSCALL and SYSRET are a standard part of the x86-64 architecture, but are also sometimes available on 32 bit machines. They provide a method of going from userspace (ring 3) to the kernel (ring 0) and back.
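SYSCALL and SYSRET are switched on through a bit in the EFER model-specific register. The sketch below shows the values involved. The MSR number and bit positions are architectural, but the helper names are invented for this example, and EFER_LME is listed only for orientation, since the text above does not describe where the bare bones sets it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_EFER 0xC0000080u           /* MSR index, placed in ECX */
#define EFER_SCE (UINT64_C(1) << 0)    /* System Call Extensions (SYSCALL/SYSRET) */
#define EFER_LME (UINT64_C(1) << 8)    /* Long Mode Enable (for orientation only) */

/* Hypothetical helper: EFER value with SYSCALL/SYSRET enabled. */
uint64_t efer_with_syscall(uint64_t efer)
{
    return efer | EFER_SCE;
}

/* RDMSR and WRMSR move a 64-bit MSR through the register pair EDX:EAX,
 * so 32 bit code handles the value as two halves. */
void split_msr(uint64_t value, uint32_t *eax, uint32_t *edx)
{
    assert(eax != NULL && edx != NULL);
    *eax = (uint32_t)(value & 0xFFFFFFFFu);
    *edx = (uint32_t)(value >> 32);
}
```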
Everything is set up to get us into long mode. However, in a curiosity of legacy, we have to establish a 32 bit environment first.
Setting up 32-bit Protected Mode
The next couple of lines load special registers on the architecture. These registers hold pointers to specific structures required by the CPU to facilitate various things.
The first structure is the GDT, or the Global Descriptor Table. The mechanics behind this relate back to the days of segmented address spaces and are mostly irrelevant to modern interests. The definition starts with a couple of small values that indicate the location and size of the structure. The table itself is defined underneath. It consists of several entries that describe the position, size, and access permissions of those sections (segments) of memory. For more information, consult the GDT main page.
Next is the IDT, or the Interrupt Descriptor Table. This is defined (in an old revision of the source) similarly to the GDT, but contains an entry for each accepted interrupt. The entries define the address of the interrupt handler and some flags that determine how this handler is called. The IDT is not required to enter long mode; normally the GDT is the only piece of the environment that is. It is left in place because it helps when debugging the boot assembly code after anything is changed. For more information, consult the main page for the IDT.
We then must set up a stack for 32 bit code. We defined the stack in load.s. All we do is set the stack pointer register to point to that address.
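For a feel of what one GDT entry looks like, here is a sketch that packs a descriptor from its fields. The bit layout is the architectural one, but gdt_entry and gdt_limit are hypothetical helper names, and the access and flag values used in the usage note are common choices rather than values taken from the XOmB sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs one 8-byte segment descriptor.
 *   bits  0-15  limit[15:0]      bits 40-47  access byte
 *   bits 16-39  base[23:0]       bits 48-51  limit[19:16]
 *   bits 52-55  flags            bits 56-63  base[31:24]            */
uint64_t gdt_entry(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu); /* the limit field is 20 bits wide */
    assert(flags <= 0xFu);     /* the flags field is 4 bits wide */
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;
    d |= (uint64_t)access << 40;
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;
    d |= (uint64_t)flags << 52;
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;
    return d;
}

/* The size value stored in front of the table (and loaded by LGDT)
 * is the table size in bytes minus one. */
uint16_t gdt_limit(unsigned entries)
{
    assert(entries >= 1 && entries <= 8192);
    return (uint16_t)(entries * 8u - 1u);
}
```

For example, gdt_entry(0, 0xFFFFF, 0x9A, 0xA) yields 0x00AF9A000000FFFF, a ring 0 code segment with the long mode (L) flag set, which is the kind of entry a far jump into 64 bit code needs.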
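An IDT entry can be packed in much the same way. The sketch below assumes the 8-byte gate format used in 32 bit protected mode, which is the mode this part of the boot code runs in; whether the XOmB table uses exactly this format is an assumption, and idt_gate32 and idt_limit are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define IDT_PRESENT         0x80u /* gate is valid */
#define IDT_INTERRUPT_GATE  0x0Eu /* 32 bit interrupt gate type */

/* Hypothetical helper: packs one 8-byte protected-mode gate.
 *   bits  0-15  handler[15:0]    bits 40-47  type and attributes
 *   bits 16-31  code selector    bits 48-63  handler[31:16]
 *   bits 32-39  reserved (zero)                                      */
uint64_t idt_gate32(uint32_t handler, uint16_t selector, uint8_t type_attr)
{
    uint64_t g = 0;
    g |= (uint64_t)(handler & 0xFFFFu);
    g |= (uint64_t)selector << 16;
    g |= (uint64_t)type_attr << 40;
    g |= (uint64_t)(handler >> 16) << 48;
    return g;
}

/* As with the GDT, the stored size is the byte size minus one. */
uint16_t idt_limit(unsigned entries)
{
    assert(entries >= 1 && entries <= 256);
    return (uint16_t)(entries * 8u - 1u);
}
```

The flags mentioned above live in the type and attributes byte: IDT_PRESENT | IDT_INTERRUPT_GATE (0x8E) describes a present, ring 0 interrupt gate.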
We are ready for the leap.
Jumping to Long Mode
ljmp $CS_KERNEL, $(start64-KERNEL_VMA_BASE)
This will jump into the 64 bit code. The left part of the instruction is a selector: it indicates the entry to use in the GDT to describe the environment we are jumping to. This is why the GDT is necessary. The right side is an address. The jump uses the lower memory mapping and not the higher memory mapping, which is why KERNEL_VMA_BASE is subtracted from start64. (The far pointer that ljmp takes here is 48 bits, a 16 bit selector plus a 32 bit offset, so it cannot name a higher memory address.) We will employ other crazy shenanigans to reach the higher memory mapping in the next part of the series.
But, alas, after this jump, you are in 64 bit long mode.
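The two operands of the ljmp can be sketched as plain arithmetic. The selector layout is architectural, but the concrete numbers in the usage note are made up for illustration: the text does not give the real values of CS_KERNEL, start64, or KERNEL_VMA_BASE, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A selector is a GDT index shifted left by 3, plus a table indicator
 * bit (0 = GDT) and a 2-bit requested privilege level. */
uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Offset operand of the far jump: the symbol's linked (higher memory)
 * address minus the base, which must fit in 32 bits. */
uint32_t far_jump_offset(uint64_t linked_addr, uint64_t vma_base)
{
    assert(linked_addr >= vma_base);
    uint64_t low = linked_addr - vma_base;
    assert(low <= UINT32_MAX);
    return (uint32_t)low;
}
```

If the kernel code segment were GDT entry 1, make_selector(1, 0, 0) would give 0x08. A symbol linked at 0xFFFF800000100040 with a base of 0xFFFF800000000000 would give a jump offset of 0x100040, just above the 1MB mark where GRUB loaded the kernel.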