Added a description of fork, 2012-36.

Last time we got our kernel to switch back and forth between itself and a user mode task. The focus of this post is adding the ability to run multiple tasks.

Hack Two Tasks in There

Our kernel currently has no way for tasks to exit, so we need to make sure they never try to return, since that will probably cause an error and hang the system (in the best case). So change the syscall invocation at the end of first to be in an infinite loop, such that if the task gets re-activated after it is done it just immediately calls back into the kernel.

Then, add a second function so that we have something else to run:

void task(void) {
	bwputs("In other task\n");
	while(1) syscall();
}

We’re going to run two tasks, so change the first_stack and first_stack_start to something like:

unsigned int stacks[2][256];
unsigned int *tasks[2];

And setup both tasks, in basically the same way as we did before:

tasks[0] = stacks[0] + 256 - 16;
tasks[0][0] = 0x10;
tasks[0][1] = (unsigned int)&first;

tasks[1] = stacks[1] + 256 - 16;
tasks[1][0] = 0x10;
tasks[1][1] = (unsigned int)&task;

Then, call activate with tasks[0] and tasks[1] instead of first_stack_start. Recompile and run, and you should see both tasks running in the order that you activate them.

Some abstractions

We’ve hardcoded a bunch of stuff again. We should clean it up so that we don’t have to keep copying these chunks every time we want a new task. Let’s move those magic numbers into constants and make a function to set up a new task:

#define STACK_SIZE 256 /* Size of task stacks in words */
#define TASK_LIMIT 2   /* Max number of tasks we can handle */

unsigned int *init_task(unsigned int *stack, void (*start)(void)) {
	stack += STACK_SIZE - 16; /* End of stack, minus what we're about to push */
	stack[0] = 0x10; /* User mode, interrupts on */
	stack[1] = (unsigned int)start;
	return stack;
}
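If you want to check the layout that init_task leaves behind, the following is a small host-side sketch that compiles on a desktop machine. It is not kernel code: init_task_host, demo_entry and FRAME_WORDS are made-up names for illustration, and it uses uintptr_t in place of unsigned int so that a function address fits on a 64-bit host.

```c
#include <stdint.h>

#define STACK_SIZE 256 /* Size of task stacks in words */
#define FRAME_WORDS 16 /* SPSR, lr, and the 14 registers the context switch saves */

/* Host-side variant of init_task: uintptr_t stands in for the 32-bit
 * unsigned int the kernel uses, so function addresses fit on any host. */
uintptr_t *init_task_host(uintptr_t *stack, void (*start)(void)) {
	stack += STACK_SIZE - FRAME_WORDS; /* Leave room for one saved frame */
	stack[0] = 0x10;                   /* Saved SPSR: user mode, interrupts on */
	stack[1] = (uintptr_t)start;       /* Saved lr: where the task starts */
	return stack;
}

/* Hypothetical entry point, only here so there is an address to store. */
void demo_entry(void) {}
```

The returned pointer is what gets handed to activate: word 0 is the saved status register, word 1 is the entry point, and the remaining 14 words are the initial register values, which we don't care about for a fresh task.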

Then, back in main clean up the task initialisation to use these abstractions:

unsigned int stacks[TASK_LIMIT][STACK_SIZE];
unsigned int *tasks[TASK_LIMIT];

tasks[0] = init_task(stacks[0], &first);
tasks[1] = init_task(stacks[1], &task);

Scheduling

Scheduling (deciding what tasks to run, when, and for how long, in a multitasking system) is a big topic. We are going to implement the simplest scheduler that works in our situation: round-robin. This means that we’ll just activate each task in the order that they were created, and then go back to the beginning and activate each in sequence again.

Since we have no way for tasks to exit, make sure the following is at the end of each task:

while(1) syscall();

This will just cause the task to repeatedly call back into the kernel when it is done.

We’ll also need to keep track in main of how many tasks there are and which one we’re currently running:

size_t task_count = 0;
size_t current_task = 0;

In order to use size_t you’ll need to include stddef.h. Now just make sure that task_count gets set correctly, and replace the code that activates user tasks with this:

while(1) {
	tasks[current_task] = activate(tasks[current_task]);
	current_task++;
	if(current_task >= task_count) current_task = 0;
}

This just activates each task in order, repeatedly, forever. Fairly simple.

You may be wondering why I didn’t use the classic current_task = (current_task + 1) % task_count trick. The reason is that, at least on ARMv6, gcc actually generates a library call for the % operator. You could try linking in libgcc, but there are some problems. For this case, it just wasn’t worth it, and so I’m using an if statement instead.
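To see that the if statement wraps around the same way the modulo would, here is the increment pulled out into a function that can be tested on its own. The name next_task is hypothetical; in the kernel the two lines simply live inside the loop in main.

```c
#include <stddef.h>

/* Advance to the next task, wrapping back to 0 without using the % operator. */
size_t next_task(size_t current_task, size_t task_count) {
	current_task++;
	if(current_task >= task_count) current_task = 0;
	return current_task;
}
```

With two tasks, calling next_task repeatedly from 0 gives 1, 0, 1, 0 and so on, which is exactly the round-robin order.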

Code so far on GitHub

Setup for new Syscall

So, our syscall syscall is cute, but it’s not very useful. For good multitasking, we want a way for a user mode task to create new tasks. We’ll do this the way Unix does: fork. This syscall copies the current process, and then returns the ID of the new process to the parent, and 0 to the child.

We need a way for the kernel to know what the user mode task wants it to do. As I mentioned in a previous post, this is what the “argument” to svc is for. We could add code to our context switch to mask the bits off of the end of the svc instruction, but let’s not bother our context switch now. We’re going to store the id of our syscall in a register. syscalls.s is now:

.global syscall
syscall:
	push {r7}
	mov r7, #0x0
	svc 0
	pop {r7}
	bx lr

.global fork
fork:
	push {r7}
	mov r7, #0x1
	svc 0
	pop {r7}
	bx lr

That’s two syscalls. Each wrapper puts its syscall id in r7, and the context switch saves that value on the task’s stack along with the other registers, where the kernel can read it. The caller’s original value of r7 gets saved and restored by the syscall wrapper.

In order to call our new syscall, we’ll need to add the following to asm.h:

int fork(void);

memcpy

As you may have guessed, we’re going to implement the fork syscall, in order to make our multitasking actually useful. To do that, we’re going to need to copy stuff. Because we’re not linking in libc at this point, we need our own memcpy, like so:

void *memcpy(void *dest, const void *src, size_t n) {
	char *d = dest;
	const char *s = src;
	size_t i;
	for(i = 0; i < n; i++) {
		d[i] = s[i];
	}
	return d;
}

Forking

We’re going to use fork to spawn task from first. So, move task up above first, and replace the first call to syscall with:

if(!fork()) task();

Then, modify main so that only first gets set up to run.

The Actual Syscall

So, we’re jumping into the kernel now, but so far we don’t actually do anything with this new syscall. How can we even tell which syscall is being called? Well, remember that the syscall id is in r7 when the context switch happens, so by the time we get to the kernel, we can access it like so:

switch(tasks[current_task][2+7]) {
	case 0x1:
		bwputs("fork!\n");
	break;
}

What’s that 2+7 for? Well, remember, we push SPSR and the supervisor mode lr on the top of the stack, so we have to move past those to get to the registers.
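If the bare 2+7 bothers you, one option is to name the offsets. This is only a sketch: FRAME_SPSR, FRAME_LR, FRAME_R0, saved_reg and syscall_id are hypothetical names, not something the kernel code in this post defines. The layout itself is the one the context switch builds, with SPSR and lr first and r0 onwards after them.

```c
/* Word offsets into a saved task frame, lowest address first. */
enum {
	FRAME_SPSR = 0, /* Saved status register */
	FRAME_LR   = 1, /* Supervisor mode lr: where to resume the task */
	FRAME_R0   = 2  /* r0 comes next, followed by r1, r2, and so on */
};

/* Read saved register rN (N from 0 to 10) out of a task's frame. */
unsigned int saved_reg(const unsigned int *frame, unsigned int n) {
	return frame[FRAME_R0 + n];
}

/* The syscall wrappers put the syscall id in r7 before executing svc. */
unsigned int syscall_id(const unsigned int *frame) {
	return saved_reg(frame, 7);
}
```

With these, the switch could read switch(syscall_id(tasks[current_task])), which says the same thing as the indexed version.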

Our kernel has a limitation. Specifically, TASK_LIMIT. If we call fork when there is no space for a new task, that would be bad, so we should return an error:

if(task_count == TASK_LIMIT) {
	tasks[current_task][2+0] = -1; /* No room: fork returns -1 */
} else {
	/* There is room: the rest of the fork code below goes here */
}

What are we doing here? We’re changing the saved value of r0, which, you will recall, is the return value of a function. So, we are setting the return value that the user mode task will see to -1.

Now, what does fork actually need to do? It copies all the state from one task into a new one. The new task will need a stack pointer, and said stack pointer will need to be exactly as far from the end of the stack as the current task’s stack pointer:

size_t used = stacks[current_task] + STACK_SIZE - tasks[current_task];
tasks[task_count] = stacks[task_count] + STACK_SIZE - used;

Now we actually have to copy the stack over. Luckily, we know exactly how much to copy, since it’s the same as the distance from the stack pointer to the end:

memcpy(tasks[task_count], tasks[current_task], used*sizeof(*tasks[current_task]));

fork is specified to return the new PID in the parent process, and 0 in the child process:

tasks[current_task][2+0] = task_count;
tasks[task_count][2+0] = 0;

And, finally, we should probably record the fact that there’s a new task:

task_count++;
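Putting the pieces together, the whole fork handler can be sketched as one function and exercised on a desktop machine. Two assumptions to flag: do_fork is a hypothetical helper (in the kernel this code sits inline in the switch in main), and the sketch uses the host's memcpy from string.h rather than our own.

```c
#include <stddef.h>
#include <string.h>

#define STACK_SIZE 256 /* Size of task stacks in words */
#define TASK_LIMIT 2   /* Max number of tasks we can handle */

unsigned int stacks[TASK_LIMIT][STACK_SIZE];
unsigned int *tasks[TASK_LIMIT];
size_t task_count = 0;

/* Handle a fork request from tasks[current_task]. */
void do_fork(size_t current_task) {
	if(task_count == TASK_LIMIT) {
		/* No free slot: the caller sees fork() return -1 */
		tasks[current_task][2+0] = (unsigned int)-1;
		return;
	}
	/* Words in use, from the saved stack pointer to the end of the stack */
	size_t used = stacks[current_task] + STACK_SIZE - tasks[current_task];
	/* The child's stack pointer sits equally far from the end of its stack */
	tasks[task_count] = stacks[task_count] + STACK_SIZE - used;
	memcpy(tasks[task_count], tasks[current_task],
	       used*sizeof(*tasks[current_task]));
	tasks[current_task][2+0] = (unsigned int)task_count; /* Parent gets the child's id */
	tasks[task_count][2+0] = 0;                          /* Child gets 0 */
	task_count++;
}
```

Starting from one task with a 16-word frame on its stack, a call to do_fork(0) leaves task_count at 2, gives the child an identical frame at the same depth in its own stack, and sets the two saved r0 values to 1 and 0. A second call fails with -1 because TASK_LIMIT is 2.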

That’s it!

We now have a multitasking kernel (remember to increase TASK_LIMIT if you want to run more tasks!), and user mode tasks can start new tasks. Next time: hardware interrupts!

Code for this post is on GitHub.

Updated 2012-33 to fix a bug in the context switch assumptions.
Edited for clarity, 2012-34.

Last time, we got our kernel to set up space for a task, and run that task in user mode. This time we’re going to add a facility for the user mode task to call back into the kernel.

Syscalls

But wait! Didn’t I say last time that one of the things user mode tasks cannot do is change the CPU mode? Well, yes, they can’t change it directly. What they can do is trigger an event that will make the CPU switch to supervisor mode and then jump to some predefined bit of code. That way there’s no security problem: only kernel code ever runs in supervisor mode, but the user mode task can still ask the kernel to do things for it.

The instruction that lets us do this on an ARMv6 CPU is svc (used to be called swi). It takes an immediate value as an “argument”, which it actually does nothing with. If the kernel wants to use that number for something (like specifying what the user mode task wants done), it has to read the number right out of the instruction in RAM. This is doable, but not always ideal, and so some kernels (such as modern Linux) actually just always use a zero, and then store information they want to pass elsewhere.
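For the curious, reading the number out of the instruction is mostly a matter of masking. In the 32-bit ARM encoding the immediate occupies the low 24 bits of the svc instruction word. The sketch below assumes the kernel has already fetched that word (in ARM state it sits 4 bytes before the address left in lr on entry), and svc_immediate is a hypothetical name, not something we use in this kernel.

```c
#include <stdint.h>

/* Extract the immediate from a 32-bit ARM svc instruction word.
 * The top 8 bits are the condition and opcode; the rest is the immediate. */
uint32_t svc_immediate(uint32_t instruction) {
	return instruction & 0x00FFFFFF;
}
```

For example, svc 0 assembles to 0xEF000000, so the mask leaves 0, and svc 1 assembles to 0xEF000001, which leaves 1.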

Vector Interrupt Table

The svc instruction actually causes an interrupt. Interrupts are signals that (when they’re enabled) cause the CPU to switch modes and jump to some predefined instruction. What instruction will it jump to? Well, ARM CPUs have something called the vector interrupt table that is at the very start of RAM. These locations are where it will jump to (each word, starting at 0x0, holds the one instruction that runs for a particular sort of interrupt, up until there are no more interrupt types).

That may not seem very useful. We can only execute one instruction? Well, yes, but that instruction can jump us to somewhere more useful. Now, your first thought may be to put a branch instruction there. Great idea, but it won’t work. A branch instruction encodes its target as an offset from its own address. If we copy one to another location in RAM, the offset stays the same but now points somewhere else. We need to use a load instruction to load an absolute value into the program counter. What value should we load? The address of a function in our kernel, of course! It turns out that the assembler has a syntax for storing the address of a function directly as a word of data, which is then an absolute value and so is still correct after we copy it.
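To make the problem with copied branches concrete, here is a host-side sketch that works out where an ARM b instruction lands. In the 32-bit encoding, the low 24 bits are a signed offset in words, counted from the instruction's own address plus 8. The function name branch_target is made up for this illustration.

```c
#include <stdint.h>

/* Where an ARM b instruction located at `address` will jump to. */
uint32_t branch_target(uint32_t address, uint32_t instruction) {
	uint32_t offset = (instruction & 0x00FFFFFF) << 2; /* Words to bytes */
	if(offset & 0x02000000) offset |= 0xFC000000;      /* Sign-extend 26 bits */
	return address + 8 + offset;
}
```

The same instruction word, 0xEA000010, lands at 0x10048 when it sits at 0x10000, but at 0x50 when it has been copied to 0x8. A stored absolute address has no such dependence on where it lives.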

Here’s what the new bootstrap.s looks like:

.global _start
_start:
	mov r0, #0x08
	ldr r1, =interrupt_table
	ldr r3, =interrupt_table_end
keep_loading:
	ldr r2, [r1, #0x0]
	str r2, [r0, #0x0]
	add r0, r0, #0x4
	add r1, r1, #0x4
	cmp r1, r3
	bne keep_loading

	ldr sp, =0x07FFFFFF
	bl main

interrupt_table:
	ldr pc, svc_entry_address
	svc_entry_address: .word svc_entry
interrupt_table_end:

Why do we need to copy two words? Well, even that load instruction is loading from a relative address: it reads the word right after itself. Luckily, if we move them both then the relative position remains the same. This may look a bit complicated, and that’s because it is. You could just copy the two words directly across, but this way we have a loop that copies everything from our interrupt table section, so we can easily add other interrupt handlers to it later. You’ll note we started at 0x8 instead of 0x0, because that’s where the SVC handler is, so if we want to add ones that come before 0x8, we just have to remember to change the base address at the start. With some hacks we could do the copying part in C, but for this example I decided that keeping all the bits for this in assembly was easiest.
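For comparison, the logic of the keep_loading loop looks like this in C. This is a host-side sketch only: copy_table is a hypothetical name, and it copies into an ordinary array rather than to address 0x8, which is the part that would need the hacks in a real kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy every word in [start, end) to dest, like the keep_loading loop.
 * Returns the number of words copied. */
size_t copy_table(uint32_t *dest, const uint32_t *start, const uint32_t *end) {
	size_t copied = 0;
	while(start != end) {
		*dest++ = *start++;
		copied++;
	}
	return copied;
}
```

Here start and end play the roles of interrupt_table and interrupt_table_end, and dest is the base address we load into r0. Like the assembly version, it copies the load instruction and the address word after it together, so their relative position is preserved.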

Syscall Wrapper

We need a way for our C code to call the svc instruction. Since we don’t have anything we really need to pass through, we’ll just add a dummy wrapper for now.

Create a new file called syscalls.s and add it to the Makefile as a dependency of kernel.elf. Put this in it:

.global syscall
syscall:
	svc 0
	bx lr

Pretty simple. It just executes the svc instruction, and then when control comes back it returns to the caller (note that this assumes the registers it had before it called svc are restored before it comes back).

You’ll also need to add a line to asm.h:

void syscall(void);

Save Kernel State

Now, let’s think about what we want the kernel to do when it actually gets control back. We’d like to go back to the point in our kernel code we left off when calling activate in the first place. The problem is, we’ve not got any clue where that was! In loading the user mode task state, we did not keep around any information about what state the kernel was in. Let’s add that to activate now:

mov ip, sp
push {r4,r5,r6,r7,r8,r9,r10,fp,ip,lr}

Put that at the top of activate, before we begin messing about with the registers. That’s all the kernel state we need to save!

SVC Entry

Alright, we’re now ready to define our first version of the svc_entry function. All we really want to do at this point is get the kernel back into a state where it can run, and then return to our code. We’ll put this in context_switch.s:

.global svc_entry
svc_entry:
	pop {r4,r5,r6,r7,r8,r9,r10,fp,ip,lr}
	mov sp, ip
	bx lr

Just reverse the save we just did, and jump back to where we came from in the kernel. How is lr where we came from? Well, you’ll have noted that when we call functions (and the C compiler does this as well), we use bl, which saves the address of the instruction after itself in lr before jumping to the function.

Alright, add a call to syscall to the end of first and then add another bwputs call after the activate call in main, build, and run. You should see your new message printed last.

Code so far on GitHub

Heading Back to User Mode

Add another print and another syscall to first, and another print and call to activate to main. Compile and run your code. What do you see?

When we call activate again, the user mode task re-starts at the beginning! That’s not what we want, but it makes sense. We never saved our place inside the user mode task. In fact, we don’t even have a reference to where its stack was when it called the syscall, so we’re just going back with the stack from before. That’s not going to work. We need to add some code to our SVC entry to save the user mode task’s state. If you recall the way we set up the task before, you’ll now see why. The way the stack looks after we save our state on it is exactly the same as the way we set it up! Everything will be where activate expects it:

msr CPSR_c, #0xDF /* System mode */
push {r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,fp,ip,lr}
mov r0, sp
msr CPSR_c, #0xD3 /* Supervisor mode */

mrs ip, SPSR
stmfd r0!, {ip,lr}

There, all the registers are on the user stack. We use stmfd which is just like push, but lets us operate on another register. You’ll note we had to save both versions of lr. One is state for the syscall wrapper to use, the other is the address inside the syscall wrapper we need to jump back to later. Now we just need some way to get the new location of the top of the stack back to the kernel.

Conveniently, our C code expects the return value of a function to be in r0, which is where I’ve put the user mode sp in this example. Just change the definition of activate in asm.h to return the right type, and then assign the return value somewhere and pass that to the next call to activate. You can now keep calling into your user mode task as many times as you want!

That’s It!

We now have a kernel that can start a user mode task, and switch back and forth between the kernel and the user mode task at will. Next time we’ll look at multitasking and maybe some other stuff.