Basic OS writing I : the bootloader

An operating system can fill many roles, and which ones are present depends on the OS. The most important are to act as an intermediary between programs and the hardware, to manage multitasking so that more than one program can run at the same time, and to manage memory.

While a modern OS is a big affair (over 1 GB these days), well beyond what a single person could write, an OS by itself doesn't require that much work. The original MS-DOS source code is only a few thousand lines of assembly, while a fairly decent-looking OS such as KolibriOS fits on a single 1.44 MB floppy disk.

So let's look at some basic elements of OS writing. The OSDev Wiki is an invaluable resource on the topic, and I recommend reading it alongside this article.

Development setup

Writing an OS involves rebooting the computer a lot, so it is much more convenient to work in a virtual machine such as VirtualBox.

As this will not be a terribly complicated operating system, I will write everything in x86 assembly, x86 being the most common processor family for desktop PCs. To turn the assembly source into a usable OS, first assemble it with NASM, using the shell command

nasm -f bin -o OS.bin OSCode.asm

Here OSCode.asm is the x86 source file, OS.bin is the resulting binary, and the format option -f bin asks for a flat raw binary, with no headers or relocation information.

To use it with VirtualBox, you then need to create a virtual floppy disk image, using the dd command

dd if=/dev/zero of=floppy.img bs=1024 count=1440
dd if=OS.bin of=floppy.img conv=notrunc

The first line writes 1440 zero-filled blocks of 1024 bytes (1.44 MB, the standard size of a floppy disk) to the image, while the second overwrites the beginning of the image (or all of it, depending on the size of the program) with the binary; conv=notrunc keeps dd from truncating the image to the size of OS.bin. The result is a virtual floppy disk that the virtual PC can read.
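For readers who would rather script the image creation, the two dd steps can be sketched in Python. This is only a minimal sketch and not part of the toolchain used here: the helper name build_floppy_image is made up for illustration, and it works on bytes in memory rather than on files.

```python
FLOPPY_SIZE = 1440 * 1024  # 1,474,560 bytes, the size produced by the first dd command


def build_floppy_image(os_binary: bytes) -> bytes:
    """Return a zero-filled floppy image with os_binary copied to its start.

    Hypothetical helper mirroring the two dd commands: the first step creates
    a blank image, the second overwrites its beginning without truncating it.
    """
    if len(os_binary) > FLOPPY_SIZE:
        raise ValueError("binary does not fit on a 1.44 MB floppy")
    image = bytearray(FLOPPY_SIZE)           # like: dd if=/dev/zero ... count=1440
    image[:len(os_binary)] = os_binary       # like: dd ... conv=notrunc
    return bytes(image)
```

Writing the returned bytes to floppy.img in binary mode should give the same file as the dd commands.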

The bootloader

When an x86 computer boots up, after doing its own housekeeping such as the POST, the BIOS looks for a bootable device. It runs through a list of devices (in an order defined in the BIOS settings) until it finds one marked as bootable. This is indicated by the boot signature, or magic number: in sector 0 of the device (its first 512 bytes), the last two bytes, at offsets 510 and 511, are 0x55 and 0xAA.
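The signature test is easy to reproduce on the host side, for instance to check an image before booting it. The following is a small sketch; the function name has_boot_signature is an invention for this example.

```python
SECTOR_SIZE = 512


def has_boot_signature(sector: bytes) -> bool:
    """Check the boot signature described above: 0x55 at offset 510, 0xAA at 511."""
    if len(sector) < SECTOR_SIZE:
        return False  # not even a full sector, so it cannot carry the signature
    return sector[510] == 0x55 and sector[511] == 0xAA
```

Passing it the first 512 bytes of floppy.img tells you whether the BIOS would consider the image bootable.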

Once that device is found, sector 0 is loaded into RAM at the address 0x0000:0x7C00 (segment 0, offset 0x7C00), and the processor then jumps to that address and executes the code it finds there.
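Real-mode addresses come as a segment and an offset, and the processor forms the physical address as segment × 16 + offset. Since this arithmetic comes up repeatedly, here is a small sketch of it; the name linear_address is chosen for illustration only.

```python
def linear_address(segment: int, offset: int) -> int:
    """Physical address of segment:offset in real mode (segment * 16 + offset)."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return (segment << 4) + offset
```

Note that several segment:offset pairs name the same byte: segment 0 with offset 0x7C00 and segment 0x07C0 with offset 0 both give 0x7C00.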

The first job of an operating system is therefore to load itself into RAM, using these 512 bytes to do so. The program that does this is called the bootloader. There are two main ways to go about it, depending on its complexity.

A single-stage bootloader is entirely contained in those 512 bytes, and loads the operating system directly.
A two-stage bootloader first loads a simple 512-byte boot sector, whose sole purpose is to load a more complex bootloader, which in turn loads the operating system.

For now, let's avoid complications and work with a single stage bootloader, which will not particularly check for the configuration of the computer itself, or be very prepared to deal with errors.

Now the code. First, it's important to know that all x86 computers boot up as if it were still 1979. For reasons of backward compatibility stretching all the way back to its first incarnations, the x86 CPU starts out in a mode called "real mode", in which it behaves much like the 8088 microprocessor. It works with 16-bit registers, can only address 1 MB of RAM and has no virtual memory, but on the other hand it can call the BIOS functions exposed through the interrupt table. Ralf Brown's Interrupt List is a very nice overview of those functions.

Because of this, the first line of code will be [BITS 16], telling the assembler to generate 16-bit code. The second line will be [org 0x7C00], which tells the assembler the address at which the first byte of the program will sit in RAM, 0x7C00 being where the BIOS loads the boot sector. After that, the actual program can start.

The most important thing we need to do is to load the operating system into memory from the floppy disk. This is done with the BIOS call that reads sectors into memory, invoked with the instruction int 0x13; interrupt 0x13 groups the BIOS disk services. We need to fill the following registers:

The AH register needs to be set to 0x02 to select the function we want (read sectors), while AL holds the number of sectors to read. All in all, we set AX to 0x0201 to read a single sector.
The RAM address to write to is given by the register pair ES:BX. It's generally a good idea to keep the operating system in a region of its own, leaving the rest of the RAM to run programs, and the bootloader already occupies addresses 0x7C00 to 0x7DFF. To give it a nice round number, let's put the OS at linear address 0x8000, which is 0x0800:0x0000 in segment:offset form.
The head and drive number go in the registers DH and DL. The sector number goes in the low 6 bits of CL, and the 10-bit cylinder number is spread over CX as follows:

            CH                                  CL
Bit         7   6   5   4   3   2   1   0       7   6   5   4   3   2   1   0
Cylinder    c₇  c₆  c₅  c₄  c₃  c₂  c₁  c₀      c₉  c₈
Sector                                                  s₅  s₄  s₃  s₂  s₁  s₀

(if the cylinder number varies, setting CX becomes more complicated)
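To make the bit layout concrete, here is a short sketch that packs a cylinder and a sector number into the 16-bit value expected in CX. The function name pack_cx is hypothetical; the layout is the one shown in the table above.

```python
def pack_cx(cylinder: int, sector: int) -> int:
    """Pack a 10-bit cylinder and a 6-bit sector into CX for int 0x13.

    CH (high byte) holds cylinder bits 7..0; CL bits 7..6 hold cylinder
    bits 9..8; CL bits 5..0 hold the sector number, which starts at 1.
    """
    if not 0 <= cylinder <= 0x3FF:
        raise ValueError("cylinder must fit in 10 bits")
    if not 1 <= sector <= 0x3F:
        raise ValueError("sector must be between 1 and 63")
    ch = cylinder & 0xFF
    cl = ((cylinder >> 8) << 6) | sector
    return (ch << 8) | cl
```

For a floppy the cylinder never exceeds 79, so the two high cylinder bits in CL stay at zero and CL is simply the sector number.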

A 3.5'' 1440 kB floppy disk (or high-density 3.5'' floppy) has 2 sides, 80 tracks per side and 18 sectors per track of 512 bytes, for a total of

$$2 \times 80 \times 18 \times 512 = 1440 \times 1024 = 1,474,560 \text{B}$$

If we wish to read the entire content of the floppy disk, we'll have to loop over every such value. The numbering is somewhat inconsistent: sectors in CL go from 1 to 18, cylinders in CH go from 0 to 79, and heads in DH from 0 to 1. Keep in mind that for the first head and the first cylinder, the first sector is the bootloader, which we don't need, hence the first sector loaded will be head 0, cylinder 0, sector 2. With every sector read, we'll need to advance the target offset in BX by 512 and, if BX wraps around past 0xFFFF, add 0x1000 to the segment register ES.
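The loop over the whole disk can be sketched as a conversion from a running sector index to a (cylinder, head, sector) triple. The ordering is an assumption, since the text does not fix one: this sketch uses the conventional layout in which the sector number changes fastest, then the head, then the cylinder. The name index_to_chs is made up for the example.

```python
SECTORS_PER_TRACK = 18
HEADS = 2
CYLINDERS = 80


def index_to_chs(index: int) -> tuple:
    """Map a 0-based sector index to (cylinder, head, sector) on a 1.44 MB floppy.

    Assumes the conventional ordering: all sectors of a track, then the
    other head, then the next cylinder. Sectors are numbered from 1.
    """
    if not 0 <= index < CYLINDERS * HEADS * SECTORS_PER_TRACK:
        raise ValueError("index outside the floppy")
    cylinder, rest = divmod(index, HEADS * SECTORS_PER_TRACK)
    head, sector0 = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector0 + 1
```

Index 0 is the boot sector itself, so a full read of the disk would start at index 1, that is cylinder 0, head 0, sector 2.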

However, we should not load too far into the RAM: from segment 0xA000 onward (linear address 0xA0000), addresses map to system memory such as the video RAM. To stay well clear of it, and to keep things simple, we will only use the linear addresses 0x8000 to 0x9FFF, for a total of 8 kB. This corresponds to 16 sectors, with any further sectors the operating system needs being left for the hard drive later on.

This simplifies the selection of cylinders and sectors somewhat: everything can be loaded from cylinder 0, head 0. With 18 sectors per track and the bootloader in sector 1, this leaves us with one spare sector. We'll use sector 2 to store the hard-drive bootloader for later on, and the OS itself will be loaded from sectors 3 to 18, meaning that CX will only vary from 0x0003 to 0x0012.
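As a sanity check on this layout, the sketch below lists, for each OS sector, the CX value, the offset in BX and the resulting linear address. It assumes the load segment 0x0800 used as _OS_SEGMENT in the bootloader listing; the function name load_plan is hypothetical.

```python
SECTOR_SIZE = 0x200      # 512 bytes
OS_SEGMENT = 0x0800      # assumed from _OS_SEGMENT in the bootloader listing
FIRST_OS_SECTOR = 3
LAST_OS_SECTOR = 18


def load_plan() -> list:
    """Return (cx, bx_offset, linear_address) for every OS sector, in load order."""
    plan = []
    offset = 0
    for sector in range(FIRST_OS_SECTOR, LAST_OS_SECTOR + 1):
        cx = sector                              # cylinder 0, so CH = 0 and CL = sector
        linear = (OS_SEGMENT << 4) + offset      # segment * 16 + offset
        plan.append((cx, offset, linear))
        offset += SECTOR_SIZE
    return plan
```

Under these assumptions the sixteen sectors land back to back, the last one ending just below linear address 0xA000, and BX never comes close to overflowing.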

The function to read sectors to the RAM also returns some parameters. To avoid delving too much into the error handling of the hardware, let's just consider the carry flag CF : CF will be set if any error occurs during the transfer, in which case we may want to restart the process.

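[BITS 16]
[org 0x7C00]

_OS_SEGMENT		equ	0x0800		; OS load segment (linear address 0x8000)
_SEG_PER_TRACK		equ	0x12		; 18 sectors per track
_TRACK_PER_SIDE		equ	0x50		; 80 tracks per side (unused for now)
_SEG_SIZE		equ	0x0200		; 512 bytes per sector
_OS_FLOPPY_SEGMENT	equ	0x0003		; first OS sector on the floppy


start:		cli				; clear the interrupt flag

		mov	ax,	_OS_SEGMENT
		mov	es,	ax		; ES:BX = destination buffer


		xor	bx,	bx		; BX = offset 0 in the OS segment
		mov	cx,	_OS_FLOPPY_SEGMENT	; CH = cylinder 0, CL = sector 3
		xor	dx,	dx		; DH = head 0, DL = drive 0 (first floppy)

loadKernel:	mov	ax,	0x0201		; AH = 0x02 (read sectors), AL = 1 sector
		int	0x13
		jc	loadKernel		; CF set: the read failed, try again
		inc	cl			; next sector

		add	bx,	_SEG_SIZE	; next 512-byte slot in RAM
		cmp	cl,	_SEG_PER_TRACK
		jbe	loadKernel		; keep going until sector 18 has been read

		jmp	_OS_SEGMENT:0x0000	; far jump into the loaded OS



		TIMES 510 - ($ - $$) db 0	; pad the boot sector up to offset 510
		DW 0xAA55			; boot signature, stored as bytes 0x55, 0xAA

		TIMES 512 db 0			; sector 2, reserved for the hard-drive bootloader

%include 	"OS"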

The first step is to clear the interrupt flag with cli, to avoid any hardware interruption disrupting the bootloading process.

[...]

After the bootloader comes the code of the operating system proper. For now we can put in a simple bit of code that just fills the screen with white to check that everything works, perhaps with a little logo later. First we need to signal that this section of code runs at a different address from the bootloader, using the directive [SEGMENT 0x8000], so that the labels are addressed correctly.

To use the graphical display while we're still in real mode, we can call the interrupt int 0x10 with AH set to 0x00 (set video mode) and AL set to 0x13, which selects the 320 x 200 mode with 256 colors. This video mode has its VRAM at segment 0xA000 (linear address 0xA0000).

Then we just write the desired color to the appropriate memory locations. The screen is 320 x 200 pixels at one byte per pixel, so it occupies 64,000 bytes starting at 0xA000:0x0000. We can fill it with the instruction rep stosd, which stores the double word in EAX at ES:DI, advances DI by 4 (with the direction flag clear), and repeats this CX times. The value of white in the default palette is 0x0F, meaning we need to set the ES register to 0xA000, DI to 0x0000 and EAX to 0x0F0F0F0F. The visible screen needs CX = 0x3E80 (16,000 double words); the code here uses 0x4000, which fills the whole 64 kB segment and so covers the screen with a little to spare.
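The arithmetic behind the fill can be checked with a short sketch. It assumes the usual layout of this video mode, one byte per pixel stored row by row; the helper names pixel_offset and fill_dword_count are invented for the example.

```python
WIDTH, HEIGHT = 320, 200
VRAM_SEGMENT = 0xA000


def pixel_offset(x: int, y: int) -> int:
    """Offset of pixel (x, y) from the start of VRAM, one byte per pixel, row by row."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        raise ValueError("pixel outside the 320 x 200 screen")
    return y * WIDTH + x


def fill_dword_count() -> int:
    """Number of 4-byte stores needed to cover the visible screen exactly."""
    return WIDTH * HEIGHT // 4
```

By this arithmetic the visible screen needs 16,000 double words (0x3E80), and its last pixel sits at linear address 0xAF9FF; a count of 0x4000 is a rounded-up value that covers the full 64 kB segment.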

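[BITS 16]
[SEGMENT 0x8000]
		mov	ax,	0x0013		; AH = 0x00 (set video mode), AL = mode 0x13
		int	0x10
		mov	bx,	0xA000
		mov	es,	bx		; ES = VRAM segment

		xor	di,	di		; DI = offset 0 in VRAM
		mov	eax,	0x0F0F0F0F	; four white pixels (palette index 0x0F)
		mov	cx,	0x4000		; 0x4000 double words = the whole 64 kB segment
		cld				; make stosd move upward through memory
		rep	stosd

hang:		jmp	hang			; nothing left to do: loop forever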

The hard-drive bootloader

As mentioned, we reserved sector 2 of cylinder 0 for the hard-drive bootloader; the hard drive will be more convenient to boot from than the floppy disk.

The process is essentially the same: if there is no better boot candidate (that is, once we have removed the floppy disk), the BIOS will look at sector 0 of the hard drive, which should also carry the magic number. The main difference will be the

Last updated : 2017-09-03 16:43:20

Tags : operating-system , assembly , x86