Singpolyma

Technical Blog

Archive for the "Tech" Category

Liberated Pixel Cup

Posted on

This is my first post about the Liberated Pixel Cup. I only became aware of the contest near the end of last month (indirectly, though a discussion on identi.ca) and was immidiately intrigued. The first phase of the contest (which is completed) involved pixel artists submitting compatible-yet-thematically-diverse isometric art. One of the big blockers to me getting seriously involved in a video game project has always been access to a body of art. Looking over the art that has been submitted, I think there is enough to do something I would be interested in, so I decided to start.

Using my latest favourite hammer, Haskell, as well as previous favourites of mine, SDL (which has good Haskell bindings) and Chipmunk Physics (which has acceptable Haskell bindings, though a bit out of date), I played around with ideas until I hit some inspiration. I am going to attempt to create a PvP brawler (dubbed “BitBrawl”) since this is a sort of game that I have always enjoyed (and of which there are too few!) I’m going to experiment with a twist involving “energy pellets” that players pick up which both decrease their odds of being eliminated (damage increases said odds) and gets used up by special abilities. I’m currently enjoying my new LoL gutter that I bought at unrankedsmurfs.com.

I want the combat mechanics to be of medium complexity (more complex than “hit A repeatedly”, but simpler than something like God of War), and auto-generated maps. Though honestly that’s a bit grandiose at this point. My primary target with the project is really experience, though I hope to also get a playable game out of it that can maybe be improved upon after the competition ends.

The last two days are the first chance I’ve really gotten to do some serious coding, and I’ve made quite a bit of progress, given my lack of experience with the tools and with game development in general. I have a single player who can walk around the screen (using any of the walkcycle art submitted to the competition) using configurable (hard-coded, but in a datastructure) keyboard commands and configurable (I parse a text file) animations.

Character control uses the top-down “Tank demo” physics from Chipmunk, which means that collisions and such will be nicely handled for me (once I get around to adding a second thing to the space).

My next step will be allowing two characters to be walking around. Then maybe a better background than just all-black. Then trivial combat. We’ll see how far I get.

One thing I’ve had to do is make a small patch to the Chipmunk bindings for Haskell to give me access to the maxForce and maxBias properties of constraints (necessary for the “Tank demo” physics). I will submit the patch upstream, but may have to package the library myself for my submission if it does not get mainlined in time.

I have released the code on GitHub under my normal ISC license, as well as the GPLv3.0 license required by the competition. There is no art or data in the repository yet, so it won’t run unless you put some in, but you can take a look. It’s a bit messy and IO-heavy right now, but it is getting cleaner as I go.

Hope to write more later.

Let’s Talk About Strings

Posted on

A string is a sequence of letters and numbers. It is text. Or, is it just any array? The term ‘string’ has been used and abused over time to mean many different things. This has lead to confusion, bugs, and a plethora of conflicting opinions.

The Two Main String Types

There are two things people mainly mean when they say ‘string’. On the one hand, they may mean a data structure representing human-language text. On the other hand, they may mean whatever data structure or type their programming environment often uses for representing human-language text. In many cases, this data structure is also used for other sorts of data, such as raw binary data.

This has lead to much confusion.

In some programming environments, like C, everything is low-level enough that there cannot be expected to be One True Way to handle a high-level concept such as “human-language text”. Other environments, however, should know better.

Encodings

An encoding, in the context of text, is the scheme whereby text is represented in actual bytes. I will talk about three different kinds of encodings: internal, input, and output encodings.

Internal Encoding

In low-level languages (and, sadly, many high-level languages) there is no native support for the high-level concept of ‘text’. The programmer is given a way to represent an array of bytes (also called a byte string) and must decide how to encode text in RAM for his/her own application. This is true of C’s char*, Ruby 1.8’s strings, and many others. Historically, ASCII was the de-facto internal encoding, but this is not generally acceptable if your application will be used for anything except a subset of English. Different internal encodings have different trade-offs, which I’m not really interested in covering here. As with so many other things you probably should not be making this decision. Find a library or a programming environment that handles text and let it deal with how to actually store the data in RAM.

Input Encoding

Input encoding is the only encoding an application programmer should ever have to deal with. Unfortunately, for historical reasons, there are a plethora of encodings out there. You will have to decide a way to know how the file, network socket, or other input you are reading is encoded. Many protocols and file formats have simple markers that will let you know. You have to feed this information to whatever calls you use to get your high-level text representation from the input.

Output Encoding

This is the encoding you use in files you write, prints to stdout, bytes you send over network sockets, etc. If you are writing some existing file format or protocol, you need to see how that format handles specifying what encoding you are using and correctly mark it. This, however, should be the extent of what you need to do to handle output encodings. I’m going to say something some people consider controversial, but I’ve given it a lot of thought, and after working with this stuff in many contexts I’ve come to a conclusion.

There is only one acceptable output encoding, and it is UTF-8.

Always.

There are some people around who will bad-mouth Unicode for some of the problems it had historically (for awhile they used 16-bit-max-width encodings and could not properly handle some Asian languages), but these have been fixed for some time now. The standard is not set in stone, so if bugs are found they can be fixed.

The other thing people complain about is that UTF-8 is unfairly smaller for English text than it is for other text. It is true that language-specific encodings will take less space than UTF-8, however the complexity and potential for bugs that comes from using a plethora of encoding mechanisms as a hobo compression mechanism is not worth it. If you’re concerned about space, then compress your content.

UTF-8 is backwards-compatible with ASCII implementations, such that they will continue to work in a UTF-8 environment (at least as much as they ever worked). This also means that your application can easily handle old ASCII data even if all you implement is UTF-8.

Haskell and Ruby

As a way to illustrate this topic, I will take as examples Haskell and Ruby 1.9.

In Ruby 1.9, strings have an associated encoding. They are objects with a byte string and an encoding property. When you write them out, they are written in whatever encoding you specify. This is a huge improvement over 1.8 (where all encoding decisions were manual), but still exposes more complexity than is necessary to the programmer.

Furthermore, all I/O operations in Ruby 1.9 are done using String. How is binary data read, then? String has an associated encoding called ‘binary’. This is just shameful. The programmer still has to keep track of which String are text and which are byte strings.

In Haskell, the default String type is actual an alias for [Char] (a list of Char) and Char is defined to be a 32-bit Unicode code point. It is unequivocally for text. Binary data can be represented as [Word8] (a list of Word8, that is, a list of bytes).

Because linked lists are not always the most efficient, there are also convenient array-packed-representation libraries for both of these types, intuitively named Text and ByteString.

Unfortunately, because of the confusion that often comes from the shameful state of so much tooling, many Haskellers use ByteString.Char8 to store what should be Text, and handle the internal encoding themselves (often poorly).

I/O in Haskell can easily be done directly with any of these types.

Summary

  1. Use a text library for text, use a raw byte string type for binary data. Do not confuse the two.
  2. If you’re reading an existing format, you’ll have to correctly detect the input encoding and transform to the datastructure used by your text library.
  3. If you’re writing an existing format, you’ll have to correctly identify what output encoding you’re using.
  4. You should only use UTF-8 as your output encoding.

Haskell for Rubyists

Posted on

In the last year I’ve been playing with a new language: Haskell. I have found it to be a very suitable second high-level language for me as a Rubyist, and in this post I will explain (with examples!) some of why I, as a Rubyist, love Haskell, and how you can use it to do awesome things.

Why Another Language?

One thing I wasn’t sure about for a long time was if I even needed another language. Obviously, being able to work in any environment you have thrown at you is an essential job skill, but I mean a language I chose for myself. I am a Rubyist, do I need to be anything else?

Well, while I love Ruby, there are a few reasons I eventually chose to go in search of a second language to call my own:

  1. Performance

    Ruby implementations are getting faster all the time, but we all know there are faster things out there. It’s the reason some things (like the JSON gem) have versions written in C. I felt it would be nice for some tasks to get the performance boost that comes from an optimising compiler, without having to drop all the way to C.

  2. Portability

    Yes, Ruby is super-portable… to systems that have a ruby implementation. It’s somewhat complex to just email arbitrary users a ruby script and hope they can use it without setup help.

  3. Linking with native code

    Ruby extensions and FFIs exist so that we can call C code from Ruby. What about calling Ruby code from C or another language? It can be done, but only if most of MRI is linked in to the target and the Ruby is more-or-less “eval’d”.

In case you haven’t guessed, basically I wanted a nice high-level environment like Ruby, but with a native code output.

Isn’t Haskell Hard?

No. Or at least, not harder than any other language. It is true that the Haskell community has a higher-than-average concentration of Ivory Tower Dwellers. Yes, some of them have been in the Tower for so long that they have forgot how to write anything but symbols from higher-order logics. Yes, the documentation for some really nice Haskell libraries and features are dense academic papers. Don’t let them scare you off. There are humans in the community as well, and # on freenode IRC has many of them.

Type Inference

One of the nice features of Ruby is the type system. If you’re used to un-inferred static typing (read: C) then the ability to write code like this:

def fun(a, b); (a + b) * 3; end

is liberating. Haskell has a static type system, which means that you’ll never have a program crash in production because you’re passing in different data than you though, but only in a case your tests didn’t catch. Unlike C, however, Haskell’s system is strong (which means that data is not magically cast for you, so you get stronger guarantees, just like how in Ruby we must write 1.to_s + "hello" not 1 + "hello"), but more importantly it is inferred, so the equivalent of the above in Haskell is:

fun a b = (a + b) * 3

You can add type annotations (like in C) if you want to, which sometimes helps for clarity, but you don’t need to.

The only limitation here is that data structures are mostly of a single type, for example in Ruby:

a = [1, "hello"]

is perfectly fine. This is sometimes a good thing, and sometimes causes strange bugs. In Haskell, this would be an error, so we need to define unions explicitly:

data StuffInMyList = I Integer | S String
a = [I 1, S "hello"]

A small pain, but I feel it’s a fine trade-off.

Mixins

The mixin module is one of the defining characteristics of Ruby. Haskell has something similar, called Typeclasses, which form the foundation of polymorphism in the language. In Ruby:

module Equality
def equals?(b); self == b; end
end

class Thing
include Equality
end

In Haskell:

class (Eq a) => Equality a where
	isEqual :: a -> a -> Bool
	isEqual x y = x == y

data Thing = Thing deriving (Eq)

instance Equality Thing

This looks a bit different. You’ll note I had to give a type signature to the isEqual function. This is one of the few places you have to, and it has to do with making the polymorphism we get with mixins a bit safer. My Equality mixin has to be restricted to types from the Eq typeclass (because I use == on them), which is also true in Ruby except that in Ruby every single class has == defined.

Significant Whitespace

Haskell has significant whitespace. If you’re a Rubyist on the run from Python this may scare you, but there are two reasons this does not bother me. First, the Haskell whitespace is much nicer than in Python, and the way code gets written in Haskell you rarely have the “where does this huge block end?” problem. Second, the whitespace in Haskell is optional! Here’s that typeclass again, but without the whitespace use:

class (Eq a) => Equality a where { isEqual :: a -> a -> Bool; isEqual x y = x == y; }

Great!

Let’s see a real example!

You may have heard that Haskell I/O is weird, and that Haskell has no access to mutation. While Ruby code is often non-destructive in nature itself, access to mutation is sometimes handy. Understanding why Haskell I/O is safe and such is not terrible, but it does take learning a new concept (called Monads, with roots in those academics, but there are good simple explanations out there without too much math, like in Learn You a Haskell (for Great Good), which I recommend), but doing simple I/O is actually not complicated.

main = do {
text <- readFile "somefile.txt";
print $ length $ lines text;
}

This is the Haskell code to read a text file, split it in to lines, count the number of lines, and print out that number. Pretty simple!

What about mutation? Well, it is true that there are no globals in Haskell, but really, who uses globals? If you really need mutation for something, the simplest way to make a reference is:

import Data.IORef

main = do {
someRef <- newIORef 1;
val <- readIORef someRef;
print val;
writeIORef someRef 12;
val <- readIORef someRef;
print val;
}

Of course, if you want you could make this a bit less verbose:

import Data.IORef

x := y = writeIORef x y
new x = newIORef x
get x = readIORef x

main = do {
someRef <- new 1;
val <- get someRef;
print val;
someRef := 12;
val <- get someRef;
print val;
}

Many Libraries

Haskell has a very active community that has produced many libraries covering all sorts of cases. The main place to look for these is Hackage.

REPL

Another thing that drew me to Ruby initially was irb. The ability to just fire up a shell-like environment and enter expressions, and load in my code and play with it live, is a very nice thing. There are several such environments for Haskell, the one that I prefer is GHCI, which also has commands to set breakpoints and such (which I have never needed) and to find out what the type of some expression is (very handy).

Other Useful Bits

There is a very useful tool for Haskell called hlint, which analyses your code and make (sometimes surprisingly insightful) suggestions. I don’t always agree with it, but it is very nice.

Debug.Trace is a very useful library for printing out arbitrary values from anywhere in your code without otherwise affecting the behaviour of the code. Very useful for debugging.

If you want to learn more, I highly recommend Learn You a Haskell for Great Good.

Writing a Simple OS Kernel — Part 5, Hardware Interrupts

Posted on

Last time we got the kernel switching back and forth between multiple tasks. This time we’re going to get a simple timer to wake up the kernel at fixed intervals.

Motivation

So far we’ve been interacting with hardware when we want to. Writing to the serial port when we have something to print, for example. Why would we want to let the hardware interrupt what we’re doing?

A few reasons:

  1. Currently, our user mode tasks have to call a syscall on purpose in order for any other task to be able to run. This means one task can run forever and block anyone else from running (called starvation). So we at minimum need some way to interrupt running tasks (called preemption).
  2. The hardware knows when it’s ready. It is way more efficient to let the hardware say “I’m ready now” then to keep asking it.
  3. The hardware may actually stop being ready by the time we get around to talking to it again. We want to talk to the hardware during the window when it’s able to talk

Enabling a Simple Timer

Before we get to the serial port, there is an even simpler piece of hardware we can work with first: timers. Timers just take a value and count down to zero, sort of like an alarm clock.

I could make you go look up in the user guides what all the magic numbers for our timers are, but I’m a nice guy, so here’s the ones we need to add to versatilepb.h:

/* http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0271d/index.html */
# TIMER0 ((volatile unsigned int*)0x101E2000)
# TIMER_VALUE 0x1 /* 0x04 bytes */
# TIMER_CONTROL 0x2 /* 0x08 bytes */
# TIMER_INTCLR 0x3 /* 0x0C bytes */
# TIMER_MIS 0x5 /* 0x14 bytes */
/* http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0271d/Babgabfg.html */
# TIMER_EN 0x80
# TIMER_PERIODIC 0x40
# TIMER_INTEN 0x20
# TIMER_32BIT 0x02
# TIMER_ONESHOT 0x01

Now add the following somewhere near the top of your main:

*TIMER0 = 1000000;
*(TIMER0 + TIMER_CONTROL) = TIMER_EN | TIMER_ONESHOT | TIMER_32BIT;

The timer on our QEMU system defaults to using a 1Mhz reference, which means that it ticks 1000000 times per second. So, by setting it’s value to 1000000, it will reach 0 after one second.

Then, on the second line, we enable the timer, set it to “oneshot” mode (that is, it will just stop when it’s done counting, we’ll see another mode later), and set it to 32BIT mode (so that it can hold really big numbers).

Now, somewhere in your while loop, put this:

if(!*(TIMER0+TIMER_VALUE)) {
   bwputs("tick\n");
   *TIMER0 = 1000000;
   *(TIMER0 + TIMER_CONTROL) = TIMER_EN | TIMER_ONESHOT | TIMER_32BIT;
}

We’re just printing out “tick” if the timer has reached 0, and then resetting and re-enabling the timer.

Build and run your kernel. It should still work as before, but now it will also print out “tick” once per second.

Back to the Vector Interrupt Table

The CPU uses the same mechanism for hardware interrupting operation (often referred to as an “interrupt request” or IRQ) as it used know where to jump when we used svc. We need to add an entry to the right place. Here’s the new code for bootstrap.s:

interrupt_table:
ldr pc, svc_entry_address
nop
nop
nop
ldr pc, irq_entry_address
svc_entry_address: .word svc_entry
irq_entry_address: .word irq_entry
interrupt_table_end:

IRQ entry point

Back over in context_switch.s, irq_entry is going to be very similar to svc_entry, but with some changes to handle an IRQ instead of a syscall.

As soon as we enter system mode, we need to push the value of r7 onto the task’s stack (because that’s what the syscall wrapper does, and we need to set things up to look the same). The syscall puts the number identifying the syscall into r7, so we should come up with something that will identify an IRQ and put that there instead. To make sure the number never clashes with a syscall number, lets use negative numbers:

push {r7} /* Just like in syscall wrapper */
/* Load PIC status */
ldr r7, PIC
ldr r7, [r7]
PIC: .word 0x10140000
/* Most significant bit index */
clz r7, r7
sub r7, r7, #31

That value at PIC is the address of the interrupt controller status (that is, the value that can tell us which interrupts are currently firing) and then we just do some operations on that value to find out what the most significant bit set in the value is, and transform that into a negative index to use for the identifier of this IRQ.

The only other weird thing is that we have to get the value of the lr from IRQ mode instead of Supervisor mode, and the value is set to the instruction after the one we should be returning to, so we’ll need to subtract 4:

msr CPSR_c, # /* IRQ mode */
mrs ip, SPSR
sub lr, lr, # /* lr is address of next instruction */
stmfd r0!, {ip,lr}

So those bits, together with what we already had from svc_entry give us this complete entry point code:

.global irq_entry
irq_entry:
	/* Save user state */
	msr CPSR_c, # /* System mode */

	push {r7} /* Just like in syscall wrapper */
	/* Load PIC status */
	ldr r7, PIC
	ldr r7, [r7]
	PIC: .word 0x10140000
	/* Most significant bit index */
	clz r7, r7
	sub r7, r7, #31

	push {r0,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,fp,ip,lr}
	mov r0, sp

	msr CPSR_c, # /* IRQ mode */
	mrs ip, SPSR
	sub lr, lr, # /* lr is address of next instruction */
	stmfd r0!, {ip,lr}

	msr CPSR_c, # /* Supervisor mode */

	/* Load kernel state */
	pop {r4,r5,r6,r7,r8,r9,r10,fp,ip,lr}
	mov sp, ip
	bx lr

A Small Problem

We’re pushing the value of r7, but where is it going to get popped? The syscall wrappers do both the push and the pop, but in the case of an IRQ we’re pushing in the entry point, and we have no easy way to tell activate that it’s exiting from an IRQ instead of a syscall. Hmm.

The solution to this that I’ve chosen is to remove the pop instruction from the syscall wrappers, and add an identical instruction to right after the pop instruction already in activate. Now the pop always happens.

More Magic Numbers

We hardcoded a single magic number into our assembly, but luckily that’s the only one our assembly code will need. In fact, we should not need any more assembly! (Unless we wanted to implement other sorts of hardware exceptions, which we may get to.) We do need that and a few other interrupt-controller related magic numbers in our C code too, so here’s the stuff for versatilepb.h:

/* http://infocenter.arm.com/help/topic/com.arm.doc.dui0224i/I1042232.html */
# PIC ((volatile unsigned int*)0x10140000)
# PIC_TIMER01 0x10
/* http://infocenter.arm.com/help/topic/com.arm.doc.ddi0181e/I1006461.html */
# VIC_INTENABLE 0x4 /* 0x10 bytes */

Enable the Interrupt

We have to tell the timer to generate interrupts, and we have to tell the interrupt controller to pass those interrupts through. So, replace your current one-shot timer code with this:

*(PIC + VIC_INTENABLE) = PIC_TIMER01;

*TIMER0 = 1000000;
*(TIMER0 + TIMER_CONTROL) = TIMER_EN | TIMER_PERIODIC | TIMER_32BIT | TIMER_INTEN;

Note that we’re also using a new timer mode here. Periodic mode will automatically cause the timer to restart from the value we set when it hits 0, which is perfect for when we’re getting an interrupt instead of waiting until the value hits 0.

Handle the Interrupt

Now we have to add a case to our switch statement for this request:

case -4: /* Timer 0 or 1 went off */
	if(*(TIMER0 + TIMER_MIS)) { /* Timer0 went off */
		*(TIMER0 + TIMER_INTCLR) = 1; /* Clear interrupt */
		bwputs("tick\n");
	}

The way our hardware is set up, the interrupt is actually the same for either timer 0 or timer 1. Now, we know that we’ve only set timer 0, but that’s lazy. TIMER_MIS is the offset where we’ll find a flag telling us if Timer0 was the one that is causing the interrupt. After checking that, we set a value on the timer to clear the interrupt (basically to say that we’ve seen it so that it won’t call us again unitl the next interrupt). Finally, print out “tick”.

As it is, this code should operate exactly the same as the version without the interrupts.

Pre-emption

Up until now, we’ve needed the dummy syscall syscall in an infinite loop at the end of every task in order to let other tasks run. Since the timer interrupts the tasks now, we don’t need them to interrupt themselves. Remove all references to the syscall syscall and try running the kernel again. Now you’ll note that the first task just stays in its infinite loop and does not let the other task keep running. Does not, that is, until the timer goes off and you see “tick” printed for the first time, after which the tasks switch. So the tasks are now being properly interrupted even if they do not give up control!

We’re done!

The kernel now has all the machinery it needs to be able to handle hardware interrupts, and tasks are being pre-empted by the timer. Next time we’ll start looking at IPC.

The code so far is on GitHub.

Writing a Simple OS Kernel — Part 4, Multitasking

Posted on

Added a description of fork, 2012-36.

Last time we got our kernel switch back and forth between a user mode task. The focus in this post is going to be adding the ability to run multiple tasks.

Hack Two Tasks in There

Our kernel currently has no way for tasks to exit, so we need to make sure they never try to return, since that will probably cause an error and hang the system (in the best case). So change the syscall invocation at the end of first to be in an infinite loop, such that if the task gets re-activated after it is done it just immediately calls back into the kernel.

Then, add a second function so that we have something else to run:

void task(void) {
	bwputs("In other task\n");
	while(1) syscall();
}

We’re going to run two tasks, so change the first_stack and first_stack_start to something like:

unsigned int stacks[2][256];
unsigned int *tasks[2];

And setup both tasks, in basically the same way as we did before:

tasks[0] = stacks[0] + 256 - 16;
tasks[0][0] = 0x10;
tasks[0][1] = (unsigned int)&first;

tasks[1] = stacks[1] + 256 - 16;
tasks[1][0] = 0x10;
tasks[1][1] = (unsigned int)&task;

Then, call activate with tasks[0] and tasks[1] instead of first_stack_start. Recompile and run, you should see both tasks running in the order that you activate them.

Some abstractions

We’ve hardcoded a bunch of stuff again. We should clean it up so that we don’t have to keep copying these chunks every time we want a new task. Let’s move those magic numbers into constants and make a function to set up a new task:

# STACK_SIZE 256 /* Size of task stacks in words */
# TASK_LIMIT 2   /* Max number of tasks we can handle */

unsigned int *init_task(unsigned int *stack, void (*start)(void)) {
	stack += STACK_SIZE - 16; /* End of stack, minus what we're about to push */
	stack[0] = 0x10; /* User mode, interrupts on */
	stack[1] = (unsigned int)start;
	return stack;
}

Then, back in main clean up the task initialisation to use these abstractions:

unsigned int stacks[TASK_LIMIT][STACK_SIZE];
unsigned int *tasks[TASK_LIMIT];

tasks[0] = init_task(stacks[0], &first);
tasks[1] = init_task(stacks[1], &task);

Scheduling

Scheduling (deciding what tasks to run, when, and for how long, in a multitasking system) is a big topic. We are going to implement the simplest possible (in our situation) schedule: round-robin. This means that we’ll just activate each task in the order that they were created, and then go back to the beginning and activate each in sequence again.

Since we have no way for tasks to exit, make sure the following is at the end of each task:

while(1) syscall();

This will just cause the task to repeatedly call back into the kernel when it is done.

We’ll also need to keep track in main of how many tasks there are and which one we’re currently running:

size_t task_count = 0;
size_t current_task = 0;

In order to use size_t you’ll need to include stddef.h. Now just make sure that task_count gets set correctly, and replace the code that activates user tasks with this:

while(1) {
	tasks[current_task] = activate(tasks[current_task]);
	current_task++;
	if(current_task >= task_count) current_task = 0;
}

This just activates each task in order, repeatedly, forever. Fairly simple.

You may be wondering why I didn’t use the classic current_task = (current_task + 1) % task_count trick. The reason is that, at least on ARMv6, gcc actually generates a library call for the % operator. You could try linking in libgcc, but there are some problems. For this case, it just wasn’t worth it, and so I’m using an if statement instead.

Code so far on GitHub

Setup for new Syscall

So, our syscall syscall is cute, but it’s not very useful. For good multitasking, we want a way for a user mode task to create new tasks. We’ll do this the way Unix does: fork. This syscall copies the current process, and then returns the ID of the new process to the parent, and 0 to the child.

We need a way for the kernel to know what the user mode task wants it to do. As I mentioned in a previous post, this is what the “argument” to svc is for. We could add code to our context switch to mask the bits off of the end of the svc instruction, but let’s not bother our context switch now. We’re going to store the id of our syscall in a register. syscalls.s is now:

.global syscall
syscall:
	push {r7}
	mov r7, #
	svc 0
	pop {r7}
	bx lr

.global fork
fork:
	push {r7}
	mov r7, #
	svc 0
	pop {r7}
	bx lr

That’s two syscalls, and the context switch will save the new value of r7 on the top of the stack, which we can read in the kernel. The actual value of r7 gets saved and restored by the syscall wrapper.

In order to call our new syscall, we’ll need to add the following to asm.h:

int fork(void);

memcpy

As you may have guessed, we’re going to implement the fork syscall, in order to make our multitasking more actually useful. To do that, we’re going to need to copy stuff. Because we’re not linking in libc at this point, we need our own memcpy, like so:

void *memcpy(void *dest, const void *src, size_t n) {
	char *d = dest;
	const char *s = src;
	size_t i;
	for(i = 0; i < n; i++) {
		d[i] = s[i];
	}
	return d;
}

Forking

We’re going to use fork to spawn task from first. So, move task up above first, and replace the first call to syscall with:

if(!fork()) task();

Then, modify main so that only first gets set up to run.

The Actual Syscall

So, we’re jumping into the kernel now, but so far we don’t actually do anything with this new syscall. How can we even tell which syscall is being called? Well, remember that the syscall id is in r7 when the context switch happens, so by the time we get to the kernel, we can access it like so:

switch(tasks[current_task][2+7]) {
	case 0x1:
		bwputs("fork!\n");
	break;
}

What’s that 2+7 for? Well, remember, we push SPSR and the supervisor mode lr on the top of the stack, so we have to move past those to get to the registers.

Our kernel has a limitation. Specifically, TASK_LIMIT. If we call fork when there is no space for a new task, that would be bad, so we should return an error:

if(task_count == TASK_LIMIT) {
	tasks[current_task][2+0] = -1;
} else {
}

What are we doing here? We’re changing the saved value of r0, which, you will recall, is the return value of a function. So, we are setting the return value that the user mode task will see to -1.

Now, what does fork actually need to do? It copies all the state from one task into a new one. The new task will need a stack pointer, and said stack pointer will need to be exactly as far from the end of the stack as the current task’s stack pointer:

size_t used = stacks[current_task] + STACK_SIZE - tasks[current_task];
tasks[task_count] = stacks[task_count] + STACK_SIZE - used;

Now we actually have to copy the stack over. Luckily, we know exactly how much to copy, since it’s the same as the distance from the stack pointer to the end:

memcpy(tasks[task_count], tasks[current_task], used*sizeof(*tasks[current_task]));

fork is specified to return the new PID in the parent process, and 0 in the child process:

tasks[current_task][2+0] = task_count;
tasks[task_count][2+0] = 0;

And, finally, we should probably record the fact that there’s a new task:

task_count++;

That’s it!

We now have a multitasking kernel (remember to increase TASK_LIMIT if you want to run more tasks!), and user mode tasks can start new tasks. Next time: hardware interrupts!

Code for this post is on GitHub.